
I was wondering if it is possible to groupby one column while counting the values of another column that fulfill a condition. Because my dataset is a bit weird, I created a similar one:

import pandas as pd

raw_data = {'name': ['John', 'Paul', 'George', 'Emily', 'Jamie'],
            'nationality': ['USA', 'USA', 'France', 'France', 'UK'],
            'books': [0, 15, 0, 14, 40]}

df = pd.DataFrame(raw_data, columns=['name', 'nationality', 'books'])

Say I want to group by nationality and count the number of people from each country who don't have any books (books == 0).

I would therefore expect something like the following as output:

nationality
USA       1
France    1
UK        0

I tried most variations of groupby with filter and agg, but I can't seem to get anything that works.

Thanks in advance, BBQuercus :)

1 Answer


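The expression below gives the required counts. df.books.eq(0) builds a boolean mask that is True where books == 0, astype(int) turns it into 0/1, and grouping that mask by df.nationality and summing counts the matches per country, including countries with no matches.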

df.books.eq(0).astype(int).groupby(df.nationality).sum()

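nationality
France    1
UK        0
USA       1

Note that groupby sorts the group keys by default, so the countries appear in alphabetical order rather than in the order shown in the question. Pass sort=False to groupby to keep the order of first appearance.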
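For reference, here is a self-contained sketch of the same approach that can be run as is. It reuses the sample data from the question; the variable names no_books and counts are only illustrative, and sort=False is an optional choice made here so the result follows the order the question expected.

```python
import pandas as pd

raw_data = {
    "name": ["John", "Paul", "George", "Emily", "Jamie"],
    "nationality": ["USA", "USA", "France", "France", "UK"],
    "books": [0, 15, 0, 14, 40],
}
df = pd.DataFrame(raw_data, columns=["name", "nationality", "books"])

# Mask of people with no books, converted from True/False to 1/0.
no_books = df["books"].eq(0).astype(int)

# Group the mask by nationality and sum it to count matches per country.
# sort=False keeps the countries in order of first appearance.
counts = no_books.groupby(df["nationality"], sort=False).sum()

print(counts)
```

Because the condition is evaluated before grouping, a country with no matching rows (UK here) still appears with a count of 0. Filtering the rows first and then grouping would drop it from the result.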
