Mapping Datetime raises KeyError

Question

asked Apr 9, 2020 in Data Science by blackindya (18.4k points)

I am new to data science and I am working on a data set which has the column named timestamp

df = pd.DataFrame({'timestamp': ['2019-10-22T15:13:36.000Z', '2019-07-16T02:13:36.000Z']})
timestamp
0 2019-10-22T15:13:36.000Z
1 2019-07-16T02:13:36.000Z

And I have a second data frame which consists of column name as Start Stop and datetimecolumn. The data frame is as follows:

L=[['2019-07-16T12:25:07.000Z', '2019-07-16T00:25:07.000000Z', '2019-07-16T13:25:07.000000Z'], ['2019-03-20T09:56:29.000Z','2019-03-19T21:56:29.000000Z','2019-03-20T10:56:29.000000Z'],['2019-08-28T08:26:03.000Z', '2019-08-27T20:26:03.000000Z','2019-08-28T09:26:03.000000Z']]
cinfo = pd.DataFrame(data=L, columns=['c_time', 'start', 'stop'])
c_time start \
0 2019-07-16T12:25:07.000Z 2019-07-16T00:25:07.000000Z
1 2019-03-20T09:56:29.000Z 2019-03-19T21:56:29.000000Z
2 2019-08-28T08:26:03.000Z 2019-08-27T20:26:03.000000Z
stop
0 2019-07-16T13:25:07.000000Z
1 2019-03-20T10:56:29.000000Z
2 2019-08-28T09:26:03.000000Z

Now I want to check the timestamp of my first dataset, if the timestamp is between the start and stop columns of the second dataset then I want to give the c_time value which is in the second dataset. If it doesn't match the stop and start columns then i want to set my c_time to NAN. Consider the example below for better understanding:

timestamp c_time
0 2019-10-22T15:13:36.000Z NaN
1 2019-07-16T02:13:36.000Z 2019-07-16T12:25:07.000Z

To get the results I have used the code as shown below:

import pandas as pd
df = pd.DataFrame({'timestamp': ['2019-10-22T15:13:36.000Z']})
L=[['2019-07-16T12:25:07.000Z', '2019-07-16T00:25:07.000000Z', '2019-07-16T13:25:07.000000Z'], ['2019-03-20T09:56:29.000Z','2019-03-19T21:56:29.000000Z','2019-03-20T10:56:29.000000Z'],['2019-08-28T08:26:03.000Z', '2019-08-27T20:26:03.000000Z','2019-08-28T09:26:03.000000Z']]
cinfo = pd.DataFrame(data=L, columns=['c_time', 'start', 'stop'])
cinfo.start=pd.to_datetime(cinfo.start)
cinfo.stop=pd.to_datetime(cinfo.stop)
cinterval = cinfo.set_index(pd.IntervalIndex.from_arrays(cinfo.start, cinfo.stop, closed='both'))['c_time']
df['c_time'] = pd.to_datetime(df['timestamp']).map(cinterval)

But I am getting an error:

KeyError: 1571757216000000000

Can anyone help me?

1 Answer

supriya · Answer 1 · 2020-04-09T08:27:58+0000

You can use the code below for getting the desired output:

import pandas as pd
df = pd.DataFrame({'timestamp': ['2019-10-22T15:13:36.000Z']})
L=[['2019-07-16T12:25:07.000Z', '2019-07-16T00:25:07.000000Z', '2019-07-16T13:25:07.000000Z'], ['2019-03-20T09:56:29.000Z','2019-03-19T21:56:29.000000Z','2019-03-20T10:56:29.000000Z'],['2019-08-28T08:26:03.000Z', '2019-08-27T20:26:03.000000Z','2019-08-28T09:26:03.000000Z']]
cinfo = pd.DataFrame(data=L, columns=['c_time', 'start', 'stop'])
cinfo.start=pd.to_datetime(cinfo.start)
cinfo.stop=pd.to_datetime(cinfo.stop)
cinterval = cinfo.set_index(pd.IntervalIndex.from_arrays(cinfo.start, cinfo.stop, closed='both'))['c_time']
df['c_time'] = pd.to_datetime(df['timestamp']).map(cinterval)
print(df)

Improve your knowledge in data science from scratch by click on the link Data Science

Mapping Datetime raises KeyError

Mapping Datetime raises KeyError

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Related questions

Browse Categories

Popular Courses

Top Tutorials

Top Articles

Top Interview Questions