
What is the difference between groupby("x").count and groupby("x").size in pandas?

Does size just exclude nil?

1 Answer


No, it is the other way around. The main difference is that size counts every row in each group, including rows where the value is NaN, while count counts only the non-NaN values:

In [123]:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':[0,0,1,2,2,2], 'b':[1,2,3,4,np.nan,4], 'c':np.random.randn(6)})
df

Out[123]:

   a    b         c
0  0    1  1.067627
1  0    2  0.554691
2  1    3  0.458084
3  2    4  0.426635
4  2  NaN -2.238091
5  2    4  1.256943

In [148]:

print(df.groupby(['a'])['b'].count())
print(df.groupby(['a'])['b'].size())

a
0    2
1    1
2    2
Name: b, dtype: int64
a
0    2
1    1
2    3
dtype: int64
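To see the two results side by side, here is a small sketch that reuses the same 'a' and 'b' values as above (the random column 'c' is left out so the result is reproducible). The variable names non_null, all_rows and missing are only illustrative. It also shows a second difference that is easy to miss: called on the whole groupby object, count gives one column per non-key column, whereas size gives a single Series of group lengths.

```python
import numpy as np
import pandas as pd

# Same 'a' and 'b' values as in the example above; 'b' has one NaN in group a == 2.
df = pd.DataFrame({
    "a": [0, 0, 1, 2, 2, 2],
    "b": [1, 2, 3, 4, np.nan, 4],
})

grouped = df.groupby("a")

# count() tallies only the non-NaN entries of 'b' in each group.
non_null = grouped["b"].count()

# size() tallies every row in each group, whether or not 'b' is NaN.
all_rows = grouped["b"].size()

# The gap between the two is the number of missing values per group.
missing = all_rows - non_null

print(pd.DataFrame({"count": non_null, "size": all_rows, "missing": missing}))

# On the whole groupby object, count() returns a DataFrame (one column per
# non-key column), while size() returns a single Series of group lengths.
print(type(grouped.count()).__name__)
print(type(grouped.size()).__name__)
```

So if you want the number of rows per group, use size; if you want the number of valid (non-NaN) observations in a particular column, use count.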

