Let's say I have a dataframe like this:

    A   B
0   a   b
1   c   d
2   e   f
3   g   h

The index values 0, 1, 2, 3 are times; a, c, e, g form one time series and b, d, f, h form another. I need to add two columns to the original dataframe, each obtained by computing the differences between consecutive rows of a given column.

So I need something like this:

    A   B   dA
0   a   b  (a-c)
1   c   d  (c-e)
2   e   f  (e-g)
3   g   h   NaN

I saw something called diff on the dataframe/series, but it works slightly differently: the first element becomes NaN instead of the last.

1 Answer

Use diff and pass -1 as the periods argument:

>>> import pandas as pd
>>> df = pd.DataFrame({"A": [9, 4, 2, 1], "B": [12, 7, 5, 4]})
>>> df["dA"] = df["A"].diff(-1)
>>> df
   A   B   dA
0  9  12  5.0
1  4   7  2.0
2  2   5  1.0
3  1   4  NaN
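
Since the question asks for two new columns, the same call can be applied to several columns at once. The following is a minimal sketch, not part of the original answer: the column name dB and the use of add_prefix and join are assumptions made for illustration. It also checks that diff(-1) gives the same result as subtracting the next row obtained with shift(-1).

```python
import pandas as pd

df = pd.DataFrame({"A": [9, 4, 2, 1], "B": [12, 7, 5, 4]})

# diff(periods=-1) computes row[i] - row[i + 1] for each selected column,
# so the last row has no successor and becomes NaN.
cols = ["A", "B"]  # columns to difference (illustrative choice)
diffs = df[cols].diff(periods=-1).add_prefix("d")  # yields columns dA, dB
df = df.join(diffs)

# Equivalent formulation: subtract the next row's value explicitly.
manual = df["A"] - df["A"].shift(-1)
pd.testing.assert_series_equal(df["dA"], manual, check_names=False)

print(df)
```

Because the last row has no following value, the new columns are stored as floats so that they can hold NaN.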
