This is my data
import numpy as np
import pandas as pd
data = {
'cases': [120, 100, np.nan, np.nan, np.nan, np.nan, np.nan],
'percent_change': [0.03, 0.01, 0.00, -0.001, 0.05, -0.1, 0.003],
'tag': [7, 6, 5, 4, 3, 2, 1],
}
cases percent_change tag
0 120.0 0.030 7
1 100.0 0.010 6
2 NaN 0.000 5
3 NaN -0.001 4
4 NaN 0.050 3
5 NaN -0.100 2
6 NaN 0.003 1
I want to create next cases' value as (next value) = (previous value) * (1+current per_change). Specifically, I want it to be done in rows that have a tag value less than 6 (and I must use a mask (i.e., df.loc for this row selection). This should give me:
cases percent_change tag
0 120.0 0.030 7
1 100.0 0.010 6
2 100.0 0.000 5
3 99.9 -0.001 4
4 104.9 0.050 3
5 94.4 -0.100 2
6 94.7 0.003 1
I have tried this:
df_index = np.where(df['tag'] == 6)
index = df_index[0][0]
df.loc[(df.tag<6), 'cases'] = (df.percent_change.shift(0).fillna(1) + 1).cumprod() * df.at[index, 'cases']
cases percent_change tag
0 120.000000 0.030 7
1 100.000000 0.010 6
2 104.030000 0.000 5
3 103.925970 -0.001 4
4 109.122268 0.050 3
5 98.210042 -0.100 2
6 98.504672 0.003 1