in Data Science by (17.6k points)

I have a pandas DataFrame with sections that look like this (the zeros stand for NaNs):

...

     18  19  20
197  14  28  14
198  14   0  14
200   0   0   0
201   0   0   0
202  15  23  12
203  16   0  18
204   0   0   0
205   0   0   0

...

I need to fill rows that have NaNs in specific columns with the values from the last row that has no NaNs in those columns.

In my example, rows 200 and 201 would be filled with values from row 197, and rows 204 and 205 with values from row 202.

Later edit: rows 198 and 203 are not NaN in all of the columns I'm interested in, so they should be left alone.

What would be a Pythonic way of writing this?

1 Answer


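1. Use DataFrame.mask with DataFrame.any to set every row that contains at least one NaN entirely to NaN, then forward fill with ffill so that each of those rows takes the values of the last complete row.

2. After that, use DataFrame.mask with DataFrame.all to take the forward-filled values only for rows in which every value is missing. Rows with only some missing values are left unchanged.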

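import numpy as np

# The sample shows NaN as 0, so turn the zeros back into NaN first
df = df.replace(0, np.nan)
m = df.isnull()
# Blank out rows with at least one NaN, then forward fill from complete rows
df1 = df.mask(m.any(axis=1)).ffill()
# Use the filled values only where the whole row is missing
df2 = df.mask(m.all(axis=1), df1, axis=1)
print(df2)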

       18    19    20
197  14.0  28.0  14.0
198  14.0   NaN  14.0
200  14.0  28.0  14.0
201  14.0  28.0  14.0
202  15.0  23.0  12.0
203  16.0   NaN  18.0
204  15.0  23.0  12.0
205  15.0  23.0  12.0
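
The question asks about "specific columns", while the snippet above works on the whole frame. As a rough, self-contained sketch of the same idea restricted to a chosen subset of columns, something like the following could work. The helper name fill_empty_rows_from_last_complete and its cols parameter are made up for illustration and are not part of pandas; the sample frame is rebuilt from the question with real NaNs in place of the zeros. It uses plain .loc assignment rather than mask, which is just one possible way to write it.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question, with NaN where the question shows 0
df = pd.DataFrame(
    {
        18: [14, 14, np.nan, np.nan, 15, 16, np.nan, np.nan],
        19: [28, np.nan, np.nan, np.nan, 23, np.nan, np.nan, np.nan],
        20: [14, 14, np.nan, np.nan, 12, 18, np.nan, np.nan],
    },
    index=[197, 198, 200, 201, 202, 203, 204, 205],
)


def fill_empty_rows_from_last_complete(frame, cols):
    """Hypothetical helper: fill rows that are all-NaN in `cols` with the
    values of the last row that has no NaN in `cols`."""
    sub = frame[cols]
    missing = sub.isna()

    # Keep only complete rows as "donors": blank out any row with a NaN,
    # then forward fill so each row carries the last complete row's values.
    donors = sub.copy()
    donors.loc[missing.any(axis=1)] = np.nan
    donors = donors.ffill()

    # Overwrite only the rows in which every selected column is missing.
    empty = missing.all(axis=1)
    out = frame.copy()
    out.loc[empty, cols] = donors.loc[empty]
    return out


result = fill_empty_rows_from_last_complete(df, [18, 19, 20])
print(result)
```

With the sample data, rows 200 and 201 take the values of row 197 and rows 204 and 205 take the values of row 202, while the partially filled rows 198 and 203 keep their NaN in column 19. Columns not listed in cols would be left as they are.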
