in Data Science by (17.6k points)

After applying Imputer.fit_transform() to my dataset, I lose the column names on the transformed data frame. Is there any way to impute the data without losing the column names?

1 Answer

by (31.1k points)

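Imputer.fit_transform() returns a NumPy array, which carries no column labels. To keep the column names, assign the returned array back into the existing dataframe instead of replacing the dataframe with it.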

If this is your dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame(data=[[1,2,3],
                        [3,4,4],
                        [3,5,np.nan],
                        [6,7,8],
                        [3,np.nan,1]],
                  columns=['A', 'B', 'C'])

Current df:

   A    B    C
0  1  2.0  3.0
1  3  4.0  4.0
2  3  5.0  NaN
3  6  7.0  8.0
4  3  NaN  1.0

Use this if you are passing the whole df to the Imputer:

from sklearn.preprocessing import Imputer  # scikit-learn < 0.22; newer versions provide sklearn.impute.SimpleImputer instead

df[df.columns] = Imputer().fit_transform(df)

To impute only some columns, assign the result back to just those columns:

columns_to_impute = ['B', 'C']
df[columns_to_impute] = Imputer().fit_transform(df[columns_to_impute])
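Imputer was removed in scikit-learn 0.22, so on a current install the same idea can be sketched with SimpleImputer from sklearn.impute. This is a minimal sketch rather than the original answer's code: the names partial and full are only illustrative, and strategy="mean" is spelled out to mirror the old Imputer default.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame(
    data=[[1, 2, 3], [3, 4, 4], [3, 5, np.nan], [6, 7, 8], [3, np.nan, 1]],
    columns=["A", "B", "C"],
)

# Impute only some columns: assigning into the existing labels keeps the names.
columns_to_impute = ["B", "C"]
partial = df.copy()  # illustrative name; work on a copy so df stays unchanged
partial[columns_to_impute] = SimpleImputer(strategy="mean").fit_transform(
    partial[columns_to_impute]
)

# Impute every column: wrap the returned NumPy array in a new DataFrame
# that reuses the original column names and index.
full = pd.DataFrame(
    SimpleImputer(strategy="mean").fit_transform(df),
    columns=df.columns,
    index=df.index,
)
```

Building a new DataFrame, as in the second half, leaves the original df untouched, which is handy if you still need to know where the missing values were.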

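Output after imputing the whole dataframe (each NaN is replaced by its column mean, which is the Imputer default):

     A    B    C
0  1.0  2.0  3.0
1  3.0  4.0  4.0
2  3.0  5.0  4.0
3  6.0  7.0  8.0
4  3.0  4.5  1.0

If you impute only B and C, column A is left untouched and keeps its integer values.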
