0 votes
2 views
in Python by (18.4k points)

I have an Oscar winners dataset with columns such as winner name, award, date of birth, place of birth, and year. I want to know how many rows there are per year. For example, 2005 has a winner for Best Actor and one for Best Director, and 2006 has a winner for Best Supporting Actor, so I want the result to look like this:

year_of_award    number of rows
2005             2
2006             1

It may look simple, but I can't get it to work. Most of the posts I found recommend combining count() with groupby. I tried that, but count() returns a count for every column, so besides the year I get the other four columns filled with the number of rows, as shown below.

df.groupby(['year_of_award']).count()

How can I get just the year and the number of rows?

1 Answer

0 votes
by (36.8k points)

You can call groupby with the column (or list of columns) to group by, then use agg to count the values in the award column for each group. Note that count skips missing values, so this counts the rows where award is not null.

df.groupby(['year_of_award']).agg(number_of_rows=('award', 'count'))

Alternatively:

df.groupby(['year_of_award']).agg({'award': 'count'}).rename(columns={'award': 'number_of_rows'})

The named-aggregation syntax in the first snippet requires pandas 0.25 or later; the dictionary form also works on earlier versions.
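To make the result concrete, here is a minimal sketch on a made-up three-row DataFrame. The sample values and the winner_name column are assumptions for illustration only; year_of_award and award come from the question. It also shows size() as one possible alternative, which counts rows per group whether or not any column has missing values.

```python
import pandas as pd

# Hypothetical sample data standing in for the Oscar dataset.
df = pd.DataFrame({
    "winner_name": ["A", "B", "C"],
    "award": ["Best Actor", "Best Director", "Best Supporting Actor"],
    "year_of_award": [2005, 2005, 2006],
})

# Named aggregation (pandas 0.25+): count non-null 'award' values per year,
# then turn the group key back into an ordinary column.
counts = (
    df.groupby("year_of_award")
      .agg(number_of_rows=("award", "count"))
      .reset_index()
)
print(counts)

# Alternative: size() counts every row in each group, including rows
# where 'award' (or any other column) is missing.
sizes = df.groupby("year_of_award").size().reset_index(name="number_of_rows")
print(sizes)
```

Both results have exactly two columns, year_of_award and number_of_rows. The two approaches differ only when the counted column contains missing values.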
