
I want to use several methods of StandardScaler from sklearn. Is it possible to apply these methods to only some columns/features of my dataset instead of the entire set?

For instance, the dataset is data:

data = pd.DataFrame({'Name': [3, 4, 6], 'Age': [18, 92, 98], 'Weight': [68, 59, 49]})

   Age  Name  Weight
0   18     3      68
1   92     4      59
2   98     6      49

col_names = ['Name', 'Age', 'Weight']
features = data[col_names]

I fit and transform the data:

scaler = StandardScaler().fit(features.values)
features = scaler.transform(features.values)
scaled_features = pd.DataFrame(features, columns=col_names)

       Name       Age    Weight
0 -1.069045 -1.411004  1.202703
1 -0.267261  0.623041  0.042954
2  1.336306  0.787964 -1.245657

But of course the names are strings, not floats, and I don't want to standardize them. How can I apply the fit and transform functions only to the Age and Weight columns?

1 Answer


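One straightforward way to do this is with ColumnTransformer from sklearn.compose, which applies transformers to a specified set of columns of an array or pandas DataFrame. Columns that are not listed are dropped by default, so pass remainder='passthrough' to keep them unchanged.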

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

data = pd.DataFrame({'Name': [3, 4, 6], 'Age': [18, 92, 98], 'Weight': [68, 59, 49]})
col_names = ['Name', 'Age', 'Weight']
features = data[col_names]

# Standardize only Age and Weight; every other column is passed through unchanged.
ct = ColumnTransformer([
        ('somename', StandardScaler(), ['Age', 'Weight'])
    ], remainder='passthrough')

# The result is a NumPy array, not a DataFrame.
ct.fit_transform(features)
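Note that fit_transform here returns a plain NumPy array in which the transformed columns come first (Age, Weight) and the passed-through columns follow (Name), so the original column order is lost. The following is a minimal sketch of one way to get a labelled DataFrame back. The names scaled_cols, other_cols and scaled_data are illustrative and not part of the original answer.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

data = pd.DataFrame({'Name': [3, 4, 6], 'Age': [18, 92, 98], 'Weight': [68, 59, 49]})

scaled_cols = ['Age', 'Weight']  # columns to standardize (illustrative name)
other_cols = ['Name']            # columns left untouched (illustrative name)

ct = ColumnTransformer(
    [('scale', StandardScaler(), scaled_cols)],
    remainder='passthrough',
)

# Output layout: the scaled columns first, then the passed-through columns.
result = ct.fit_transform(data)

# Re-attach labels in the output order, then restore the original column order.
scaled_data = pd.DataFrame(result, columns=scaled_cols + other_cols, index=data.index)
scaled_data = scaled_data[list(data.columns)]

print(scaled_data)
```

Because the array holds a single dtype, Name comes back as float here. If the untouched columns really are strings, the array will have object dtype and the scaled columns may need converting back with astype(float).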
