I am new to data science and I am working on a data set which has the column named timestamp
df = pd.DataFrame({'timestamp': ['2019-10-22T15:13:36.000Z', '2019-07-16T02:13:36.000Z']})
timestamp
0 2019-10-22T15:13:36.000Z
1 2019-07-16T02:13:36.000Z
And I have a second data frame which consists of column name as Start Stop and datetimecolumn. The data frame is as follows:
L=[['2019-07-16T12:25:07.000Z', '2019-07-16T00:25:07.000000Z', '2019-07-16T13:25:07.000000Z'], ['2019-03-20T09:56:29.000Z','2019-03-19T21:56:29.000000Z','2019-03-20T10:56:29.000000Z'],['2019-08-28T08:26:03.000Z', '2019-08-27T20:26:03.000000Z','2019-08-28T09:26:03.000000Z']]
cinfo = pd.DataFrame(data=L, columns=['c_time', 'start', 'stop'])
c_time start \
0 2019-07-16T12:25:07.000Z 2019-07-16T00:25:07.000000Z
1 2019-03-20T09:56:29.000Z 2019-03-19T21:56:29.000000Z
2 2019-08-28T08:26:03.000Z 2019-08-27T20:26:03.000000Z
stop
0 2019-07-16T13:25:07.000000Z
1 2019-03-20T10:56:29.000000Z
2 2019-08-28T09:26:03.000000Z
Now I want to check the timestamp of my first dataset, if the timestamp is between the start and stop columns of the second dataset then I want to give the c_time value which is in the second dataset. If it doesn't match the stop and start columns then i want to set my c_time to NAN. Consider the example below for better understanding:
timestamp c_time
0 2019-10-22T15:13:36.000Z NaN
1 2019-07-16T02:13:36.000Z 2019-07-16T12:25:07.000Z
To get the results I have used the code as shown below:
import pandas as pd
df = pd.DataFrame({'timestamp': ['2019-10-22T15:13:36.000Z']})
L=[['2019-07-16T12:25:07.000Z', '2019-07-16T00:25:07.000000Z', '2019-07-16T13:25:07.000000Z'], ['2019-03-20T09:56:29.000Z','2019-03-19T21:56:29.000000Z','2019-03-20T10:56:29.000000Z'],['2019-08-28T08:26:03.000Z', '2019-08-27T20:26:03.000000Z','2019-08-28T09:26:03.000000Z']]
cinfo = pd.DataFrame(data=L, columns=['c_time', 'start', 'stop'])
cinfo.start=pd.to_datetime(cinfo.start)
cinfo.stop=pd.to_datetime(cinfo.stop)
cinterval = cinfo.set_index(pd.IntervalIndex.from_arrays(cinfo.start, cinfo.stop, closed='both'))['c_time']
df['c_time'] = pd.to_datetime(df['timestamp']).map(cinterval)
But I am getting an error:
KeyError: 1571757216000000000
Can anyone help me?