Converting a Pandas GroupBy output from Series to DataFrame

Question

asked Jun 26, 2019 in Machine Learning by ParasSharma1 (19k points)

I'm starting with input data like this:

import pandas

df1 = pandas.DataFrame({
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
})

Which when printed appears like this:

       City     Name
0   Seattle    Alice
1   Seattle      Bob
2  Portland  Mallory
3   Seattle  Mallory
4   Seattle      Bob
5  Portland  Mallory

Grouping is simple enough:

g1 = df1.groupby( [ "Name", "City"] ).count()

and printing yields a GroupBy object:

                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
        Seattle      1     1

But what I want eventually is another DataFrame object that contains all the rows in the GroupBy object. In other words, I want to get the following result:

                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
Mallory Seattle      1     1

I can't quite see how to accomplish this in the pandas documentation. Any hints would be welcome.

2 Answers

Anurag · Answer 1 · 2019-06-26T13:42:37+0000

You can call the .reset_index() method on the result of .groupby() to turn the group keys back into ordinary columns.

For example:

In [1]: pandas.DataFrame({'count': df1.groupby(["Name", "City"]).size()}).reset_index()
Out[1]:
      Name      City  count
0    Alice   Seattle      1
1      Bob   Seattle      2
2  Mallory  Portland      2
3  Mallory   Seattle      1

Or you can use:

In [2]: df1.groupby(["Name", "City"]).size().to_frame(name='count').reset_index()
Out[2]:
      Name      City  count
0    Alice   Seattle      1
1      Bob   Seattle      2
2  Mallory  Portland      2
3  Mallory   Seattle      1
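
For anyone who wants to run both variants end to end, here is a minimal self-contained sketch. It assumes the df1 from the question and imports pandas as pd; the variable names sizes, via_constructor and via_to_frame are only illustrative and do not come from the answer above.

```python
import pandas as pd

# Same data as in the question.
df1 = pd.DataFrame({
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
})

# size() counts the rows in each (Name, City) group and returns a Series
# whose index is a MultiIndex built from the group keys.
sizes = df1.groupby(["Name", "City"]).size()

# Variant 1: wrap the Series in a DataFrame, then flatten the index.
via_constructor = pd.DataFrame({"count": sizes}).reset_index()

# Variant 2: turn the Series into a one-column frame, then flatten the index.
via_to_frame = sizes.to_frame(name="count").reset_index()

print(via_to_frame)
```

Both variants should give the same flat DataFrame with the columns Name, City and count, so the choice between them is mostly a matter of taste.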

Hope this answer helps.

Aarav · Answer 2 · 2019-08-03T16:07:23+0000

Simply do this:

import pandas as pd
grouped_df = df1.groupby( [ "Name", "City"] )
pd.DataFrame(grouped_df.size().reset_index(name = "Group_Count"))

Here, grouped_df.size() returns a Series holding the number of rows in each (Name, City) group, indexed by the group keys. reset_index(name = "Group_Count") then moves those keys into ordinary columns and names the count column Group_Count. That call already returns a DataFrame, so the outer pd.DataFrame() wrapper is optional.
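
To see what each step in this answer returns, the following sketch can be run as-is. It assumes the df1 from the question; the names counts and wrapped are only illustrative.

```python
import pandas as pd

# Same data as in the question.
df1 = pd.DataFrame({
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
})

grouped_df = df1.groupby(["Name", "City"])

# Series.reset_index(name=...) moves the index levels into columns and
# uses `name` as the label for the column holding the Series values.
counts = grouped_df.size().reset_index(name="Group_Count")

# Passing the result through pd.DataFrame() builds an equal frame.
wrapped = pd.DataFrame(counts)

print(type(counts).__name__)
print(counts.columns.tolist())
```

If the type printed is DataFrame, as expected, the pd.DataFrame() call around the expression can be dropped without changing the result.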
