How do I access the DataFrame for a single group in a groupby object by its key? With the following groupby:

import numpy as np
import pandas as pd

rand = np.random.RandomState(1)
df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
                   'B': rand.randn(6),
                   'C': rand.randint(0, 20, 6)})
gb = df.groupby(['A'])

I can iterate through it to get the keys and groups:

In [11]: for k, gp in gb:
             print('key=' + str(k))
             print(gp)

key=bar
     A         B   C
1  bar -0.611756  18
3  bar -1.072969  10
5  bar -2.301539  18
key=foo
     A         B   C
0  foo  1.624345   5
2  foo -0.528172  11
4  foo  0.865408  14

I would like to be able to do something like

In [12]: gb['foo']
Out[12]:
     A         B   C
0  foo  1.624345   5
2  foo -0.528172  11
4  foo  0.865408  14

But when I do that (well, actually I have to do gb[('foo',)]), I get a pandas.core.groupby.DataFrameGroupBy object, which doesn't seem to have any method that returns the DataFrame I want.

The best I can think of is

In [13]: def gb_df_key(gb, key, orig_df):
             ix = gb.indices[key]  # integer positions of the group's rows
             return orig_df.iloc[ix]

         gb_df_key(gb, 'foo', df)

Out[13]:
     A         B   C
0  foo  1.624345   5
2  foo -0.528172  11
4  foo  0.865408  14

but this is kind of nasty, considering how nice pandas usually is at these things.

What's the built-in way of doing this?

1 Answer

You can use the get_group method:

In [21]: gb.get_group('foo')
Out[21]:
     A         B   C
0  foo  1.624345   5
2  foo -0.528172  11
4  foo  0.865408  14
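
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It makes one assumption that differs from the question: it groups by the column name 'A' instead of the list ['A'], so the group keys are plain strings. As far as I know, recent pandas versions expect a length-1 tuple such as ('foo',) in get_group when the groupby was built from a length-1 list, so treat the exact key form as version-dependent.

```python
import numpy as np
import pandas as pd

# Same data as in the question
rand = np.random.RandomState(1)
df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
                   'B': rand.randn(6),
                   'C': rand.randint(0, 20, 6)})

# Assumption: group by the column name (a scalar), not ['A'],
# so that the keys are plain strings like 'foo'
gb = df.groupby('A')

# Fetch the sub-DataFrame for a single key
foo = gb.get_group('foo')
print(foo)

# If several groups are needed by key, a dict built from the
# groupby iterator is one option
groups = {key: group for key, group in gb}
print(sorted(groups))
```

get_group returns the rows of the original frame that belong to the key, with their original index labels, so there is no need to pass the original DataFrame around as in the gb_df_key helper.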
