
What is the difference between groupby("x").count and groupby("x").size in pandas?

Does size just exclude null values?

1 Answer


The main difference is that size counts every row in each group, including rows where the value is NaN, while count counts only the non-NaN values:

In [123]:

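import numpy as np
import pandas as pd

# np.nan is used here because the np.NaN alias was removed in NumPy 2.0
df = pd.DataFrame({'a': [0, 0, 1, 2, 2, 2], 'b': [1, 2, 3, 4, np.nan, 4], 'c': np.random.randn(6)})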

df

Out[123]:

   a    b         c
0  0    1  1.067627
1  0    2  0.554691
2  1    3  0.458084
3  2    4  0.426635
4  2  NaN -2.238091
5  2    4  1.256943

In [148]:

print(df.groupby(['a'])['b'].count())

print(df.groupby(['a'])['b'].size())

a
0    2
1    1
2    2
Name: b, dtype: int64

a
0    2
1    1
2    3
dtype: int64
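The same distinction applies when no column is selected: count returns a DataFrame with one non-null count per remaining column, while size returns a single Series of row counts per group. The sketch below is a minimal illustration that reuses the 'a' and 'b' values from the example above (column 'c' is left out, and the variable names counts, sizes, and missing are chosen here for illustration only). Subtracting one from the other gives the number of missing values per group.

```python
import numpy as np
import pandas as pd

# Same 'a' and 'b' values as the example above; 'c' is omitted for brevity.
df = pd.DataFrame({'a': [0, 0, 1, 2, 2, 2],
                   'b': [1, 2, 3, 4, np.nan, 4]})

grouped = df.groupby('a')

# count() ignores NaN and reports one count per non-grouping column (a DataFrame).
counts = grouped.count()

# size() counts every row in the group, NaN or not (a Series).
sizes = grouped.size()

# The gap between the two is the number of missing 'b' values in each group.
missing = sizes - counts['b']

print(counts)
print(sizes)
print(missing)
```

Here only group 2 should show a gap between size and count, because it is the only group containing a NaN in 'b'.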

