
Working with census data, I want to replace NaNs in two columns ("workclass" and "native-country") with the respective modes of those two columns. I can get the modes easily:

mode = df.filter(["workclass", "native-country"]).mode()

which returns a dataframe:

workclass native-country

0 Private United-States

However,

df.filter(["workclass", "native-country"]).fillna(mode)

does not replace the NaNs in each column with anything, let alone the mode corresponding to that column. Is there a smooth way to do this?

1 Answer


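Pass fillna the first row of the mode DataFrame rather than the DataFrame itself. That row is a Series keyed by column name, so each column is filled with its own mode: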

cols = ["workclass", "native-country"]
df[cols] = df[cols].fillna(df.mode().iloc[0])

Alternatively, reuse the mode DataFrame you already computed: replace fillna(df.mode().iloc[0]) with fillna(mode.iloc[0]).
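As for why the original attempt appeared to do nothing: when fillna receives a DataFrame, it aligns on both index and columns, so a one-row mode frame (index label 0) can only fill NaNs in row 0. The sketch below illustrates the difference. The data is a made-up stand-in for the census table, not the asker's real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the census data; values are illustrative only.
df = pd.DataFrame({
    "workclass": [np.nan, "Private", np.nan, "Private", "State-gov"],
    "native-country": ["United-States", np.nan, "United-States", "Cuba", np.nan],
})
cols = ["workclass", "native-country"]

# One-row DataFrame whose only index label is 0.
mode = df[cols].mode()

# DataFrame argument: aligned on index AND columns, so only row 0 can be filled.
filled_by_frame = df[cols].fillna(mode)

# Series argument: keys are column names, so every row of each column is filled.
filled_by_series = df[cols].fillna(mode.iloc[0])

print(filled_by_frame.isna().sum().sum())   # NaNs remaining after the DataFrame fill
print(filled_by_series.isna().sum().sum())  # NaNs remaining after the Series fill
```

Note that fillna returns a new object in both cases, so the result has to be assigned back (as in df[cols] = ...) for the change to persist.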

Example:

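import pandas as pd

# Columns of unequal length; the shorter ones are padded with NaN.
d = {
    'P3': [7, 9, 9, 9, 3],
    'P2': [8, 8, 9],
    'P1': [8, 9, 9],
}

# orient='index' builds one row per key; transpose() turns the keys into columns.
df = pd.DataFrame.from_dict(d, orient='index').transpose()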

Then df is

    P3   P2   P1
0    7    8    8
1    9    8    9
2    9    9    9
3    9  NaN  NaN
4    3  NaN  NaN

After filling the NaNs in P1 and P2 with each column's mode,

modes = df.filter(["P1", "P2"]).mode()
df[["P1", "P2"]] = df[["P1", "P2"]].fillna(value=modes.iloc[0])

df becomes

    P3   P2   P1
0    7    8    8
1    9    8    9
2    9    9    9
3    9    8    9
4    3    8    9
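One caveat about .iloc[0]: a column can have several equally frequent values, in which case mode() returns more than one row, with the modes sorted and shorter columns padded with NaN. Taking row 0 then picks the smallest mode of each column. A minimal sketch with invented data (the columns A and B are hypothetical, not part of the example above):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 1, 2, 2, np.nan],        # 1 and 2 are tied
    "B": [5, 5, 6, np.nan, np.nan],   # 5 is the only mode
})

# Two rows: column A holds both tied modes, column B is padded with NaN.
modes = df.mode()
print(modes)

# Row 0 holds the first (smallest) mode of every column.
filled = df.fillna(modes.iloc[0])
print(filled)
```

If a different tie-breaking rule matters for your data, choose the fill value per column explicitly instead of relying on row 0.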

