
I've got a pandas DataFrame filled mostly with real numbers, but there are a few NaN values in it as well.

How can I replace each NaN with the average of the column it is in?

This question is very similar to "numpy array: replace nan values with average of columns", but unfortunately the solution given there doesn't work for a pandas DataFrame.

1 Answer


Use DataFrame.fillna with the column means from DataFrame.mean to fill the NaNs:

In [27]: df
Out[27]:
          A         B         C
0 -0.166919  0.979728 -0.632955
1 -0.297953 -0.912674 -1.365463
2 -0.120211 -0.540679 -0.680481
3       NaN -2.027325  1.533582
4       NaN       NaN  0.461821
5 -0.788073       NaN       NaN
6 -0.916080 -0.612343       NaN
7 -0.887858  1.033826       NaN
8  1.948430  1.025011 -2.982224
9  0.019698 -0.795876 -0.046431

In [28]: df.mean()
Out[28]:
A   -0.151121
B   -0.231291
C   -0.530307
dtype: float64

In [29]: df.fillna(df.mean())
Out[29]:
          A         B         C
0 -0.166919  0.979728 -0.632955
1 -0.297953 -0.912674 -1.365463
2 -0.120211 -0.540679 -0.680481
3 -0.151121 -2.027325  1.533582
4 -0.151121 -0.231291  0.461821
5 -0.788073 -0.231291 -0.530307
6 -0.916080 -0.612343 -0.530307
7 -0.887858  1.033826 -0.530307
8  1.948430  1.025011 -2.982224
9  0.019698 -0.795876 -0.046431
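
For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same approach. The small frame and the names `col_means`, `filled` and the `label` column are made up for illustration and are not the data shown above. It also assumes you may have a non-numeric column, in which case passing `numeric_only=True` to `mean()` keeps the averages restricted to numeric columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen so the column means are easy to check by hand.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, 4.0, 8.0],
    "label": ["x", "y", "z"],  # non-numeric column, excluded from the means
})

# Per-column means of the numeric columns; NaNs are skipped by default.
col_means = df.mean(numeric_only=True)

# Passing a Series to fillna matches its index against the column labels,
# so each NaN is replaced by the mean of its own column.
# fillna returns a new DataFrame and leaves df unchanged.
filled = df.fillna(col_means)

print(filled)
```

Since `fillna` returns a new object here, assign the result back (`df = df.fillna(df.mean())`) if you want to keep the filled values.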
