What is the difference between groupby("x").count and groupby("x").size in pandas?
Does size just exclude null values?
The main difference is that size counts every row in each group, including rows where the value is NaN, while count counts only the non-NaN values:
In [123]:
df = pd.DataFrame({'a':[0,0,1,2,2,2], 'b':[1,2,3,4,np.nan,4], 'c':np.random.randn(6)})
df
Out[123]:
   a    b         c
0  0    1  1.067627
1  0    2  0.554691
2  1    3  0.458084
3  2    4  0.426635
4  2  NaN -2.238091
5  2    4  1.256943
In [148]:
print(df.groupby(['a'])['b'].count())
print(df.groupby(['a'])['b'].size())
a
0    2
1    1
2    2
Name: b, dtype: int64
a
0    2
1    1
2    3
dtype: int64
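The comparison above can also be written as a self-contained script. This is a minimal sketch, assuming pandas and NumPy are installed; column c uses fixed illustrative values instead of the random numbers in the answer, and the variable names (counts, sizes, missing, per_column) are made up for this example. It also shows that subtracting count from size gives the number of missing values per group:

```python
import numpy as np
import pandas as pd

# Same 'a' and 'b' columns as in the answer; 'c' holds fixed
# illustrative values (an assumption) so the script is reproducible.
df = pd.DataFrame({
    "a": [0, 0, 1, 2, 2, 2],
    "b": [1, 2, 3, 4, np.nan, 4],
    "c": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
})

grouped = df.groupby("a")["b"]
counts = grouped.count()  # non-NaN values of b in each group
sizes = grouped.size()    # rows in each group, NaN included

# Number of missing b values per group: size minus count
missing = sizes - counts

print(counts.tolist())   # [2, 1, 2]
print(sizes.tolist())    # [2, 1, 3]
print(missing.tolist())  # [0, 0, 1]

# On the whole grouped frame, count() returns one column per
# non-key column, while size() returns a single Series.
per_column = df.groupby("a").count()
print(list(per_column.columns))  # ['b', 'c']
```

Group 2 is the only group where the two results differ, because it contains the single NaN in column b.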