+3 votes
2 views
in Python by (1.3k points)

What exactly is the difference between groupby("x").count and groupby("x").size in Pandas?

2 Answers

+4 votes
by (13.2k points)

IN PANDAS

SIZE-

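Series.size / DataFrame.size

This attribute returns the number of elements in the object: the number of entries for a Series, and rows times columns for a DataFrame. NaN values are included in the total.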

Example -

>>> s = pd.Series({'s': 1, 'h': 2, 'i': 3, 'v': 4})

>>> s.size

4
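As a rough illustration of how size behaves on a Series and on a DataFrame, here is a minimal sketch. The names s and frame and the values in them are made up for this example and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a Series with one missing value
s = pd.Series({"s": 1, "h": 2, "i": 3, "v": np.nan})

# Hypothetical data: a 3-row, 2-column DataFrame with one missing value
frame = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, np.nan, 6.0]})

# size is an attribute, not a method, and it ignores whether values are NaN
series_size = s.size        # number of entries in the Series
frame_size = frame.size     # rows * columns
frame_shape = frame.shape   # (rows, columns)

print(series_size, frame_size, frame_shape)
```

Note that size is accessed without parentheses on a Series or DataFrame, whereas on a groupby object size() is a method.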

COUNT-

DataFrame.count

This method counts the non-NA values in each column (or in each row, with axis=1) of the DataFrame.

The values treated as NA are None, NaN, NaT, and pandas.NA.

EXAMPLE -

Constructing DataFrame from a dictionary:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"Name":
...                    ["shivangi", "sakshi", "aditi", "aditya", "vanshika"],
...                    "Sex": ["F", np.nan, "F", "M", "F"],
...                    "Age": [21, 28, np.nan, 30, 46]})

>>> df

       Name  Sex   Age
0  shivangi    F  21.0
1    sakshi  NaN  28.0
2     aditi    F   NaN
3    aditya    M  30.0
4  vanshika    F  46.0

The NA values are not counted:

>>> df.count()
Name    5
Sex     4
Age     4
dtype: int64

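So, basically, size counts all values including NaN, whereas count counts only the non-NaN values. The same holds after a groupby: groupby("x").size() returns a Series with the number of rows in each group, while groupby("x").count() returns a DataFrame with the number of non-NA values in each remaining column for each group.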
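The difference can be checked on the DataFrame from the example above. This is only a sketch: the variable names total_cells, non_na_total, and missing_cells are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the answer's example (two cells are missing)
df = pd.DataFrame({
    "Name": ["shivangi", "sakshi", "aditi", "aditya", "vanshika"],
    "Sex": ["F", np.nan, "F", "M", "F"],
    "Age": [21, 28, np.nan, 30, 46],
})

total_cells = df.size                       # 5 rows * 3 columns, NaN included
non_na_per_column = df.count()              # non-NA values in each column
non_na_total = int(non_na_per_column.sum()) # non-NA values overall
missing_cells = total_cells - non_na_total  # cells that count() skipped

print(total_cells, non_na_total, missing_cells)
```

The gap between the two totals equals the number of NA cells, which is also what df.isna().sum().sum() reports.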

0 votes
by (67.1k points)

You can see the difference between size and count on a groupby with the following code. Here df is a DataFrame with columns a, b, and c, in which one value of b in the group a == 2 is missing:

grouped = df.groupby('a')

grouped.count()
Out[197]: 
   b  c
a      
0  2  2
1  1  1
2  2  3

grouped.size()
Out[198]: 
a
0    2
1    1
2    3
dtype: int64
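The answer does not show how df was built. The sketch below uses one possible DataFrame that gives the same counts. The values in columns b and c are assumptions chosen for illustration. All that matters is that group a == 2 has one NaN in column b.

```python
import numpy as np
import pandas as pd

# Assumed data (not from the original answer): six rows, one NaN in column b
df = pd.DataFrame({
    "a": [0, 0, 1, 2, 2, 2],
    "b": [1.0, 2.0, 3.0, 4.0, np.nan, 4.0],
    "c": [10, 20, 30, 40, 50, 60],
})

grouped = df.groupby("a")

# count(): DataFrame of non-NA counts, one column per non-key column
counts = grouped.count()

# size(): Series of row counts per group, NaN rows included
sizes = grouped.size()

print(counts)
print(sizes)
```

For group 2, count() reports 2 for column b because the NaN is skipped, while size() reports 3 because it counts rows regardless of their contents.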


...