I've seen a few variations on the theme of exploding a column/series into multiple columns of a Pandas dataframe, but none of the existing approaches has worked for what I'm trying to do.

Given a DataFrame like so:

    key      val
id
2   foo  oranges
2   bar  bananas
2   baz   apples
3   foo   grapes
3   bar    kiwis

I want to convert the items in the key series into columns, with the val values serving as the values, like so:

 

        foo      bar     baz
id
2   oranges  bananas  apples
3    grapes    kiwis     NaN

I feel like this is something that should be relatively straightforward, but I've been bashing my head against this for a few hours now with increasing levels of convolution, and no success.

1 Answer


You can use set_index and unstack: move key into the index alongside id, then unstack that level into columns.

In [1923]: df.set_index([df.index, 'key'])['val'].unstack()
Out[1923]:
key      bar     baz      foo
id
2    bananas  apples  oranges
3      kiwis    None   grapes
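For anyone who wants to try this outside an IPython session, here is a minimal self-contained sketch. The DataFrame construction is my reconstruction of the frame in the question (the name `wide` is just for illustration), so adjust it to match your real data.

```python
import pandas as pd

# Rebuild the question's DataFrame; the index is named "id" (assumed from the printout).
df = pd.DataFrame(
    {
        "key": ["foo", "bar", "baz", "foo", "bar"],
        "val": ["oranges", "bananas", "apples", "grapes", "kiwis"],
    },
    index=pd.Index([2, 2, 2, 3, 3], name="id"),
)

# Build a two-level index of (id, key), select "val",
# then turn the inner "key" level into columns.
wide = df.set_index([df.index, "key"])["val"].unstack()
print(wide)
```

The columns come out sorted by key (bar, baz, foo), and the missing (3, baz) cell is filled with a null value.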

Or, a simplified groupby:

In [1926]: df.groupby([df.index, 'key'])['val'].first().unstack()
Out[1926]:
key      bar     baz      foo
id
2    bananas  apples  oranges
3      kiwis    None   grapes
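One practical difference between the two approaches shows up when an (id, key) pair occurs more than once. The sketch below uses a made-up frame `dup` (not from the question) to illustrate: the set_index version refuses to reshape, while the groupby version keeps the first value for each pair.

```python
import pandas as pd

# Hypothetical data: (2, "foo") appears twice.
dup = pd.DataFrame(
    {
        "key": ["foo", "foo", "bar"],
        "val": ["oranges", "lemons", "bananas"],
    },
    index=pd.Index([2, 2, 2], name="id"),
)

# unstack cannot place two values into the same (id, key) cell.
try:
    dup.set_index([dup.index, "key"])["val"].unstack()
    unstack_failed = False
except ValueError:
    unstack_failed = True

# groupby(...).first() reduces each (id, key) group to one value before reshaping.
wide_dup = dup.groupby([dup.index, "key"])["val"].first().unstack()
print(unstack_failed)
print(wide_dup)
```

If silently dropping the later duplicates is not what you want, swap first() for another aggregation that suits your data.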
