
I have some NaN values in my DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 3],
                   'B': [1, 1, 1, 3],
                   'C': [1, np.nan, 3, 5],
                   'D': [2, np.nan, np.nan, 6]})

print(df)

   A  B    C    D
0  1  1  1.0  2.0
1  1  1  NaN  NaN
2  1  1  3.0  NaN
3  3  3  5.0  6.0

How can I fill each NaN with the mean of the previous and next non-empty values in its column? For example, the second value in column C should be filled with (1+3)/2 = 2.

Desired Output:

   A  B    C    D
0  1  1  1.0  2.0
1  1  1  2.0  4.0
2  1  1  3.0  4.0
3  3  3  5.0  6.0

Thanks!

1 Answer


1. To replace the NaNs, forward fill and back fill the DataFrame with ffill and bfill.

2. Then concat the two results and group by the index, aggregating with mean:

df1 = pd.concat([df.ffill(), df.bfill()]).groupby(level=0).mean()

print (df1)

   A  B    C    D
0  1  1  1.0  2.0
1  1  1  2.0  4.0
2  1  1  3.0  4.0
3  3  3  5.0  6.0
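The two steps above can be wrapped in a small helper. This is only a minimal sketch: the function name fill_with_neighbor_mean is made up for illustration and is not part of pandas, and it assumes the index labels are unique.

```python
import numpy as np
import pandas as pd


def fill_with_neighbor_mean(df: pd.DataFrame) -> pd.DataFrame:
    """Fill each NaN with the mean of the nearest non-NaN values above and below it.

    Hypothetical helper, not a pandas API. Assumes unique index labels,
    because the forward- and back-filled copies are paired up by label.
    """
    # Stack the forward-filled and back-filled frames on top of each other.
    stacked = pd.concat([df.ffill(), df.bfill()])
    # Every index label now appears twice; averaging each pair gives the
    # midpoint of the previous and next valid values. Rows that were never
    # NaN are averaged with themselves and stay unchanged.
    return stacked.groupby(level=0).mean()


df = pd.DataFrame({'A': [1, 1, 1, 3],
                   'B': [1, 1, 1, 3],
                   'C': [1, np.nan, 3, 5],
                   'D': [2, np.nan, np.nan, 6]})

df1 = fill_with_neighbor_mean(df)
print(df1)
```

Depending on the pandas version, mean() may return floats for every column, so A and B can print as 1.0 rather than 1. The groupby also returns rows sorted by index label, which matters only if the original index is not already sorted.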

Detail:

print (df.ffill())

   A  B    C    D
0  1  1  1.0  2.0
1  1  1  1.0  2.0
2  1  1  3.0  2.0
3  3  3  5.0  6.0

print (df.bfill())

   A  B    C    D
0  1  1  1.0  2.0
1  1  1  3.0  6.0
2  1  1  3.0  6.0
3  3  3  5.0  6.0
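One consequence of averaging the two filled frames: a NaN at the very top or bottom of a column has only one valid neighbor. Since mean() skips NaN by default, such a value ends up filled with that single neighbor rather than staying NaN. A small sketch with made-up data (the frame named edge is not from the question):

```python
import numpy as np
import pandas as pd

# Made-up column with a leading NaN, an interior NaN, and a trailing NaN.
edge = pd.DataFrame({'C': [np.nan, 2.0, np.nan, 6.0, np.nan]})

# ffill leaves the leading NaN in place; bfill leaves the trailing NaN in place.
filled = pd.concat([edge.ffill(), edge.bfill()]).groupby(level=0).mean()
print(filled)
```

Here the interior NaN becomes (2+6)/2 = 4, while the first and last rows take the values 2 and 6 from their only available neighbor.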
