
What is the difference between groupby("x").count and groupby("x").size in pandas?

Does size just exclude null (NaN) values?

1 Answer


It is the other way around: size counts every row in each group, including rows where the value is NaN, while count counts only the non-null values:

In [123]:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [0, 0, 1, 2, 2, 2], 'b': [1, 2, 3, 4, np.nan, 4], 'c': np.random.randn(6)})

df

Out[123]:

   a    b         c
0  0    1  1.067627
1  0    2  0.554691
2  1    3  0.458084
3  2    4  0.426635
4  2  NaN -2.238091
5  2    4  1.256943

In [148]:

print(df.groupby(['a'])['b'].count())

print(df.groupby(['a'])['b'].size())

a
0    2
1    1
2    2
Name: b, dtype: int64
a
0    2
1    1
2    3
dtype: int64
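The same distinction shows up when you call the two methods on the whole grouped DataFrame instead of a single column. The snippet below is a minimal sketch, not part of the original answer: it reuses the frame from above but swaps the random column c for fixed values (an assumption made only so the result is reproducible), and the names grouped, counts, sizes and missing_b are illustrative.

```python
import numpy as np
import pandas as pd

# Same layout as the example above; "c" uses fixed values instead of
# np.random.randn so the result is reproducible (illustrative assumption).
df = pd.DataFrame({
    "a": [0, 0, 1, 2, 2, 2],
    "b": [1, 2, 3, 4, np.nan, 4],
    "c": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
})

grouped = df.groupby("a")

# count() works column by column and skips NaN, so it returns a DataFrame
# with one non-null count per remaining column.
counts = grouped.count()

# size() counts rows per group regardless of NaN, so it returns one Series.
sizes = grouped.size()

# The gap between the two is the number of missing values in "b" per group.
missing_b = sizes - counts["b"]

print(counts)
print(sizes)
print(missing_b)
```

Subtracting the count of a column from the group size is a quick way to see how many values are missing in each group; here only group 2 has a missing b.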

