in Machine Learning by (19k points)

I'm using the MinMaxScaler model in sklearn to normalize the features of a model.

import numpy as np

training_set = np.random.rand(4,4)*10
training_set

[[ 6.01144787,  0.59753007,  2.0014852 ,  3.45433657],
 [ 6.03041646,  5.15589559,  6.64992437,  2.63440202],
 [ 2.27733136,  9.29927394,  0.03718093,  7.7679183 ],
 [ 9.86934288,  7.59003904,  6.02363739,  2.78294206]]


 

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaler.fit(training_set)
scaler.transform(training_set)

[[ 0.49184811,  0.        ,  0.29704831,  0.15972182],
 [ 0.4943466 ,  0.52384506,  1.        ,  0.        ],
 [ 0.        ,  1.        ,  0.        ,  1.        ],
 [ 1.        ,  0.80357559,  0.9052909 ,  0.02893534]]

Now I want to use the same scaler to normalize the test set:

[[ 8.31263467,  7.99782295,  0.02031658,  9.43249727],
 [ 1.03761228,  9.53173021,  5.99539478,  4.81456067],
 [ 0.19715961,  5.97702519,  0.53347403,  5.58747666],
 [ 9.67505429,  2.76225253,  7.39944931,  8.46746594]]

But I don't want to call scaler.fit() on the training data every time. Is there a way to save the fitted scaler and load it later from a different file?

1 Answer

by (33.1k points)

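I would suggest pickle or joblib. Both are modules (not classes) that are widely used to save fitted scikit-learn objects. Note that sklearn.externals.joblib was deprecated in scikit-learn 0.21 and removed in 0.23, so on current versions you should install joblib and import it directly.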

pickle lets you dump a fitted object to a file and load it back from that file later. joblib offers the same dump/load interface.

For example:

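import joblib  # replaces the removed sklearn.externals.joblib

# Save the fitted scaler from the question to disk.
joblib.dump(scaler, 'my_dope_model.pkl')

# Later, in a different file, load it back without refitting.
scaler = joblib.load('my_dope_model.pkl')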
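To make the round trip concrete, here is a minimal sketch that fits the scaler once on the training data from the question, saves it with both joblib and pickle, reloads it, and transforms the test set. The file names and the temporary directory are assumptions made only for illustration. In practice you would use any path that both scripts can reach.

```python
import os
import pickle
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Training and test data copied from the question.
training_set = np.array([
    [6.01144787, 0.59753007, 2.0014852, 3.45433657],
    [6.03041646, 5.15589559, 6.64992437, 2.63440202],
    [2.27733136, 9.29927394, 0.03718093, 7.7679183],
    [9.86934288, 7.59003904, 6.02363739, 2.78294206],
])
test_set = np.array([
    [8.31263467, 7.99782295, 0.02031658, 9.43249727],
    [1.03761228, 9.53173021, 5.99539478, 4.81456067],
    [0.19715961, 5.97702519, 0.53347403, 5.58747666],
    [9.67505429, 2.76225253, 7.39944931, 8.46746594],
])

# Fit once, on the training data only.
scaler = MinMaxScaler().fit(training_set)

# Hypothetical output location and file names (assumptions for this sketch).
out_dir = tempfile.mkdtemp()
joblib_path = os.path.join(out_dir, "scaler.joblib")
pickle_path = os.path.join(out_dir, "scaler.pkl")

# Option 1: joblib dump/load.
joblib.dump(scaler, joblib_path)
scaler_from_joblib = joblib.load(joblib_path)

# Option 2: pickle dump/load (files must be opened in binary mode).
with open(pickle_path, "wb") as f:
    pickle.dump(scaler, f)
with open(pickle_path, "rb") as f:
    scaler_from_pickle = pickle.load(f)

# Reuse the loaded scaler on the test set; no call to fit() is needed.
scaled_test = scaler_from_joblib.transform(test_set)
print(scaled_test)
```

The loaded scaler keeps the per-column minimum and maximum learned from the training data, so test values that fall outside the training range are mapped outside [0, 1] with the default settings.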

Hope this answer helps.
 
