Back

Explore Courses Blog Tutorials Interview Questions
0 votes
1 view
in Data Science by (19k points)

I am trying to groupby a column and compute value counts on another column.

import pandas as pd

dftest = pd.DataFrame({'A':[1,1,1,1,1,1,1,1,1,2,2,2,2,2], 

               'Amt':[20,20,20,30,30,30,30,40, 40,10, 10, 40,40,40]})

print(dftest)

dftest looks like

    A  Amt

0   1 20

1   1 20

2   1 20

3   1 30

4   1 30

5   1 30

6   1 30

7   1 40

8   1 40

9   2 10

10  2 10

11  2 40

12  2 40

13  2 40

perform grouping

grouper = dftest.groupby('A')

df_grouped = grouper['Amt'].value_counts()

which gives

   A  Amt

1  30     4

    20     3

    40     2

2  40     3

    10     2

Name: Amt, dtype: int64

what I want is to keep the top two rows of each group

Also, I was perplexed by an error when I tried to reset_index

df_grouped.reset_index()

which gives following error

df_grouped.reset_index()

ValueError: cannot insert Amt, already exists

1 Answer

0 votes
by (33.1k points)

Pandas.reset_index() function generates a new DataFrame or Series with the index reset. This is used where the index is needed to be used as a column. A parameter name in reset_index is needed because Series name is the same as the name of one of the levels of MultiIndex:

df_grouped.reset_index(name='count')

Another solution is to rename Series name:

print (df_grouped.rename('count').reset_index())

   A  Amt  count

0  1 30      4

1  1 20      3

2  1 40      2

3  2 40      3

4  2 10      2

The more common solution instead of value_counts is aggregate size:

df_grouped1 =  dftest.groupby(['A','Amt']).size().rename('count').reset_index()

print (df_grouped1)

   A  Amt  count

0  1 20      3

1  1 30      4

2  1 40      2

3  2 10      2

4  2 40      3

If you wish to learn Python, then check out this Python tutorial and  Python Course by Intellipaat.

Welcome to Intellipaat Community. Get your technical queries answered by top developers!

28.4k questions

29.7k answers

500 comments

94k users

Browse Categories

...