0 votes
2 views
in Data Science by (17.6k points)

Irregular time series data is stored in a pandas.DataFrame with a DatetimeIndex set. I need the time difference between consecutive entries in the index.

I thought it would be as simple as

data.index.diff()

but got

AttributeError: 'DatetimeIndex' object has no attribute 'diff'

I tried

data.index - data.index.shift(1)

but got
ValueError: Cannot shift with no freq

I do not want to infer or enforce a frequency before doing this operation. There are large gaps in the time series that would be expanded into long runs of NaN, and the point is to find these gaps in the first place.

So, what is a clean way to do this seemingly simple operation?

1 Answer

0 votes
by (41.4k points)
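For the consecutive time differences asked about, one option that needs no frequency is to turn the index into a Series and call diff() on it. This is a minimal sketch rather than a definitive recipe: the timestamps, the `value` column, and the one-hour gap threshold are made up for illustration, and `data` only stands in for the asker's DataFrame.

```python
import pandas as pd

# Hypothetical irregular timestamps; `data` stands in for the asker's DataFrame.
data = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0, 4.0]},
    index=pd.DatetimeIndex(
        [
            "2020-01-01 00:00",
            "2020-01-01 00:05",
            "2020-01-01 06:00",
            "2020-01-03 00:00",
        ]
    ),
)

# A Series has diff(), and it does not require the index to have a freq.
deltas = data.index.to_series().diff()

# The first entry is NaT because it has no predecessor.
# Flag gaps above an arbitrary threshold (1 hour here) to locate the large holes.
gaps = deltas[deltas > pd.Timedelta(hours=1)]
print(gaps)
```

Each value in `deltas` is labeled with the later of the two timestamps it separates, so `gaps` points at the first entry after each hole.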

Consider a DataFrame df like this:

  sex  age      name
0   M   22      Sami
1   F   25      Mary
2   M   19    Sourav
3   F   18    Sakshi
4   F   32  Pratyush

There are good reasons to use the .query() method.

It is much shorter and cleaner compared to boolean indexing:

 df.query("20 <= age <= 30 and sex=='F'")

This returns:

  sex  age  name
1   F   25  Mary

The equivalent boolean indexing gives the same result:

 df[(df['age']>=20) & (df['age']<=30) & (df['sex']=='F')]

  sex  age  name
1   F   25  Mary
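The comparison above can be reproduced end to end. The sketch below rebuilds the example DataFrame from the table in this answer and checks that both filters select the same row; the variable names `via_query` and `via_mask` are only illustrative.

```python
import pandas as pd

# Rebuild the example DataFrame shown in the answer.
df = pd.DataFrame(
    {
        "sex": ["M", "F", "M", "F", "F"],
        "age": [22, 25, 19, 18, 32],
        "name": ["Sami", "Mary", "Sourav", "Sakshi", "Pratyush"],
    }
)

# query() accepts a chained comparison in a single expression string.
via_query = df.query("20 <= age <= 30 and sex == 'F'")

# Boolean indexing needs one parenthesized mask per condition, joined with &.
via_mask = df[(df["age"] >= 20) & (df["age"] <= 30) & (df["sex"] == "F")]

# Both approaches select the same rows.
print(via_query.equals(via_mask))
print(via_query)
```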

PS: there are also some disadvantages:

1. .query() cannot refer directly to column names that contain spaces or consist only of digits. Since pandas 0.25, names containing spaces can be quoted with backticks.

2. Not all functions can be applied, and in some cases we have to use engine='python' instead of the default engine='numexpr'.
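Both caveats can be illustrated with a small sketch. It assumes pandas 0.25 or later for backtick quoting; the `home city` column and its values are hypothetical and were added only to get a column name with a space in it.

```python
import pandas as pd

# "home city" is a made-up column, used only because its name contains a space.
df = pd.DataFrame(
    {
        "name": ["Sami", "Mary", "Sourav"],
        "age": [22, 25, 19],
        "home city": ["Pune", "Delhi", "Pune"],
    }
)

# Caveat 1: a name with a space must be wrapped in backticks inside query().
from_pune = df.query("`home city` == 'Pune'")

# Boolean indexing has no such restriction on column names.
from_pune_mask = df[df["home city"] == "Pune"]

# Caveat 2: string accessor methods are a typical case for engine='python'.
starts_with_s = df.query("name.str.startswith('S')", engine="python")

print(from_pune)
print(starts_with_s)
```

When an expression gets this awkward, plain boolean indexing is often the simpler choice.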
