My question is about using Pandas time series.

I have one file (Spots) that holds a pandas time series covering one month of data, with one entry every 7.5 seconds. Example:

2016-11-01 00:00:00,0

2016-11-01 00:00:07.500000,1

2016-11-01 00:00:15,2

2016-11-01 00:00:22.500000,3

2016-11-01 00:00:30,4

The other file (Target) contains only timestamps.

Example:

2016-11-01 00:00:05

2016-11-01 00:00:07

2016-11-01 00:00:23

2016-11-01 00:00:25

I want to find which spot each target datetime belongs to. Expected output for the example above:

2016-11-01 00:00:00,0 '\t' count of targets in this spot = 2

2016-11-01 00:00:07.500000,1 '\t' count of targets in this spot = 0

2016-11-01 00:00:15,2 '\t' count of targets in this spot = 0

2016-11-01 00:00:22.500000,3 '\t' count of targets in this spot = 2

2016-11-01 00:00:30,4 '\t' count of targets in this spot = 0

Thank you so much in advance. Kindly let me know if this is clear; otherwise I can try to explain more.

1 Answer


The basic difference between size and count is that size counts every row in a group, including rows with NaN values, while count counts only the non-NaN values.

Here is an example to illustrate the difference:

In [46]:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':[0,0,1,2,2,2], 'b':[1,2,3,4,np.nan,4], 'c':np.random.randn(6)})

df

Out[46]:

   a    b         c
0  0  1.0  1.067627
1  0  2.0  0.554691
2  1  3.0  0.458084
3  2  4.0  0.426635
4  2  NaN -2.238091
5  2  4.0  1.256943

In [48]:

print(df.groupby(['a'])['b'].count())
print(df.groupby(['a'])['b'].size())

a
0    2
1    1
2    2
Name: b, dtype: int64
a
0    2
1    1
2    3
dtype: int64
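To connect this back to the question, here is a minimal sketch of one way to count targets per spot. It assumes that each spot covers the interval from its own timestamp up to (but not including) the next spot's timestamp, so 00:00:23 and 00:00:25 are counted under the 00:00:22.5 spot, and that the last spot is open-ended. The names `spots`, `targets`, and `target_count` are illustrative and not from the original files. It uses `size()` because every target row should be counted.

```python
import numpy as np
import pandas as pd

# Spot start times from the question: one spot every 7.5 seconds.
spot_times = pd.Timestamp("2016-11-01") + pd.to_timedelta(
    [0, 7.5, 15, 22.5, 30], unit="s"
)
spots = pd.DataFrame({"time": spot_times, "spot": [0, 1, 2, 3, 4]})

# Target times from the question.
targets = pd.DataFrame({
    "time": pd.to_datetime([
        "2016-11-01 00:00:05",
        "2016-11-01 00:00:07",
        "2016-11-01 00:00:23",
        "2016-11-01 00:00:25",
    ])
})

# Assumption: spot times are sorted. For each target, find the position of
# the last spot that starts at or before it. side="right" puts a target that
# falls exactly on a boundary into the spot starting at that boundary.
spot_ns = spots["time"].to_numpy(dtype="datetime64[ns]")
target_ns = targets["time"].to_numpy(dtype="datetime64[ns]")
pos = np.searchsorted(spot_ns, target_ns, side="right") - 1

# Targets earlier than the first spot get position -1 and are dropped.
valid = pos >= 0
assigned = targets[valid].assign(spot=spots["spot"].to_numpy()[pos[valid]])

# size() counts rows per spot; reindex adds the spots with no targets as 0.
counts = assigned.groupby("spot").size().reindex(spots["spot"], fill_value=0)

result = spots.assign(target_count=counts.to_numpy())
for row in result.itertuples(index=False):
    print(f"{row.time}\t{row.spot}\tcount of targets in this spot = {row.target_count}")
# counts per spot 0..4 under the stated assumption: 2, 0, 0, 2, 0
```

If the spots should instead be closed on the right, or labelled by their end time, changing `side` and the `- 1` offset would be the place to adjust the rule.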
