I have some NaN values in my DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 3],
                   'B': [1, 1, 1, 3],
                   'C': [1, np.nan, 3, 5],
                   'D': [2, np.nan, np.nan, 6]})

print(df)

   A  B    C    D
0  1  1  1.0  2.0
1  1  1  NaN  NaN
2  1  1  3.0  NaN
3  3  3  5.0  6.0

How can I fill each NaN with the mean of the previous non-empty value and the next non-empty value in its column? For example, the second value in column C should be filled with (1 + 3) / 2 = 2.

Desired Output:

   A  B    C    D
0  1  1  1.0  2.0
1  1  1  2.0  4.0
2  1  1  3.0  4.0
3  3  3  5.0  6.0

Thanks!

1 Answer


1. Fill the NaNs twice: once forward with ffill (previous non-empty value) and once backward with bfill (next non-empty value).

2. Then concat the two results, group by index, and take the mean of each pair of rows:

df1 = pd.concat([df.ffill(), df.bfill()]).groupby(level=0).mean()

print(df1)

   A  B    C    D
0  1  1  1.0  2.0
1  1  1  2.0  4.0
2  1  1  3.0  4.0
3  3  3  5.0  6.0
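
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same two steps, split up so the intermediate frame is visible. The names stacked and df1 are just illustrative. In recent pandas versions the integer columns A and B may print as 1.0 rather than 1, since a mean is a float.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 3],
                   'B': [1, 1, 1, 3],
                   'C': [1, np.nan, 3, 5],
                   'D': [2, np.nan, np.nan, 6]})

# Stack the forward-filled and back-filled frames on top of each other.
# Every original index label (0..3) now appears exactly twice.
stacked = pd.concat([df.ffill(), df.bfill()])

# Average the two rows that share an index label.
df1 = stacked.groupby(level=0).mean()

print(df1['C'].tolist())  # [1.0, 2.0, 3.0, 5.0]
print(df1['D'].tolist())  # [2.0, 4.0, 4.0, 6.0]
```

Note that both NaNs in column D become 4.0, the mean of 2 and 6, which matches the desired output in the question.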

Detail:

print(df.ffill())

   A  B    C    D
0  1  1  1.0  2.0
1  1  1  1.0  2.0
2  1  1  3.0  2.0
3  3  3  5.0  6.0

print(df.bfill())

   A  B    C    D
0  1  1  1.0  2.0
1  1  1  3.0  6.0
2  1  1  3.0  6.0
3  3  3  5.0  6.0
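
One thing the example data does not show is what happens when a NaN sits at the very start or end of a column, where only one neighbour exists. The sketch below uses a made-up Series (not from the question) to compare the concat + groupby approach with plain arithmetic on the two filled frames, which is a possible alternative and not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a leading and a trailing NaN (not from the question).
s = pd.Series([np.nan, 2.0, np.nan, 4.0, np.nan])

# concat + groupby mean: NaN is skipped when averaging, so an edge NaN
# simply takes the value of its only available neighbour.
via_groupby = pd.concat([s.ffill(), s.bfill()]).groupby(level=0).mean()

# Plain arithmetic: NaN propagates through the addition, so edge NaNs stay NaN.
via_arithmetic = (s.ffill() + s.bfill()) / 2

print(via_groupby.tolist())     # [2.0, 2.0, 3.0, 4.0, 4.0]
print(via_arithmetic.tolist())  # [nan, 2.0, 3.0, 4.0, nan]
```

Which behaviour is preferable depends on whether edge gaps should be filled at all. Both versions also assume the index labels are unique, since groupby(level=0) would merge rows that share a label.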
