I have a pandas data frame which looks like this:

            EIN                                    file_num
0      10043280                          [2748, 3010, 4410]
1      10391479    [217, 829, 1753, 3131, 4376, 7428, 8048]
2      10430261                [362, 531, 3788, 4851, 5680]
3      10564355              [1165, 2117, 3498, 5101, 5666]
4      10589927  [1128, 2886, 3225, 4158, 5924, 6811, 7953]
...         ...                                         ...
1592  980634789              [5095, 5653, 5800, 6750, 8133]
1593  986001141                          [4864, 6973, 7147]
1594  990078306        [1154, 2011, 3554, 4619, 5640, 6353]
1595  990170479  [1391, 2783, 3798, 5459, 6115, 7348, 8116]
1596  990317895                    [4882, 5730, 7083, 7847]

[1597 rows x 2 columns]

As you can see, each EIN has multiple files. I want to expand my data frame so that each file gets its own row, something like this:

            EIN      file_num
0      10043280          2748
1      10043280          3010
2      10043280          4410

Any help would be appreciated.

1 Answer


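You can use the DataFrame.explode method (available in pandas 0.25 and later). It turns each element of a list into its own row and repeats the other column values. Because explode also repeats the original index label for every new row, reset the index if you want the 0, 1, 2, ... numbering shown in your expected output:

df = df.explode('file_num').reset_index(drop=True)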
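
For reference, here is a minimal, self-contained sketch of this approach using the first two rows from the question. The variable name exploded, the reset_index(drop=True) call, and the astype(int) conversion are additions for illustration, not part of the original answer. The sketch assumes file_num holds real Python lists rather than strings that merely look like lists.

```python
import pandas as pd

# Two sample rows copied from the question's data frame.
df = pd.DataFrame({
    "EIN": [10043280, 10391479],
    "file_num": [
        [2748, 3010, 4410],
        [217, 829, 1753, 3131, 4376, 7428, 8048],
    ],
})

# explode() gives each list element its own row and repeats EIN.
# It also repeats the original index label, so reset_index(drop=True)
# is added here (an assumption about the desired output) to renumber rows.
exploded = df.explode("file_num").reset_index(drop=True)

# explode() leaves the column with object dtype; converting to int is optional.
exploded["file_num"] = exploded["file_num"].astype(int)

print(exploded)
```

If the column was read from a CSV file, the lists may actually be strings, in which case they would need to be parsed into lists before explode has any effect.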
