0 votes
2 views
in Data Science by (18.4k points)

My dataframe looks like this:

df[['reported_date', 'current_date']].head()

    reported_date        current_date
0   2016-01-15 13:58:21  2016-01-18 00:00:00
1   2016-01-14 10:51:24  2016-01-18 00:00:00
2   2016-01-15 15:17:35  2016-01-18 00:00:00
3   2016-01-17 17:07:10  2016-01-18 00:00:00
4   2016-01-17 17:08:23  2016-01-18 00:00:00

I can apply date subtraction directly like:

df[['reported_date', 'current_date']].head().apply(lambda x: x[1]-x[0], axis=1)

But when I apply date_range to get the interval between the dates:

df[['reported_date', 'current_date']].head().apply(lambda x: pd.date_range(x[0], x[1], freq='B'), axis=1)

I get the following error:

"ValueError: Length of values does not match the length of index"

Can anyone tell me the right way to apply date_range() to two datetime columns?

Thank you in advance.

1 Answer

0 votes
by (36.8k points)

pd.date_range doesn't return an interval. It returns a DatetimeIndex containing every timestamp between start and end at the given frequency (here freq='B', i.e. business days). Because the start (reported_date) varies from row to row while the end (current_date) is fixed, each row produces an index of a different length, and those don't fit into a single (new) column.
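To see the differing lengths concretely, here is a small sketch. The names reported, current, ranges and lengths are made up for illustration, and only three of the question's rows are used:

```python
import pandas as pd

# Three of the reported_date values from the question (illustrative subset)
reported = pd.to_datetime([
    "2016-01-15 13:58:21",
    "2016-01-14 10:51:24",
    "2016-01-17 17:07:10",
])
current = pd.Timestamp("2016-01-18")  # the fixed current_date

# One DatetimeIndex per row; each can have a different number of entries
ranges = [pd.date_range(start, current, freq="B") for start in reported]
lengths = [len(r) for r in ranges]

print(lengths)
```

Each element of ranges is a whole DatetimeIndex rather than a single value, which is why the results cannot be lined up as one ordinary column.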

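The subtraction you used before already gives you the interval (a Timedelta) between the two dates, so there is no need for pd.date_range here: x[1] - x[0] does exactly what you want. The same subtraction also works on the whole columns at once, without apply: df['current_date'] - df['reported_date'].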
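A minimal sketch of the column-wise subtraction, rebuilding the five rows from the question. The new column names (interval, days, business_days) are my own. The last step is an assumption on my part: since the question used freq='B', a count of business days may be what was wanted, and NumPy's np.busday_count (which counts weekdays from the start date up to, but not including, the end date) is one way to get it:

```python
import numpy as np
import pandas as pd

# Rebuild the sample data shown in the question
df = pd.DataFrame({
    "reported_date": pd.to_datetime([
        "2016-01-15 13:58:21",
        "2016-01-14 10:51:24",
        "2016-01-15 15:17:35",
        "2016-01-17 17:07:10",
        "2016-01-17 17:08:23",
    ]),
    "current_date": pd.to_datetime(["2016-01-18 00:00:00"] * 5),
})

# Vectorized subtraction: one Timedelta per row, no apply needed
df["interval"] = df["current_date"] - df["reported_date"]

# Whole days contained in each interval
df["days"] = df["interval"].dt.days

# Assumption: a business-day count is the goal behind freq='B'.
# np.busday_count works on day-resolution dates, so the time of day is dropped.
df["business_days"] = np.busday_count(
    df["reported_date"].values.astype("datetime64[D]"),
    df["current_date"].values.astype("datetime64[D]"),
)

print(df[["interval", "days", "business_days"]])
```

For the first row this gives an interval of 2 days 10:01:39 and a single business day (Friday the 15th), because the 16th and 17th fall on a weekend and the end date itself is excluded.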
