in Data Science by (17.6k points)

I'm confused about how to use cross_val_predict with a test data set.

I created a simple Random Forest model and used cross_val_predict to make predictions:

import pandas as pd

from sklearn.ensemble import RandomForestClassifier

# sklearn.cross_validation was removed in scikit-learn 0.20; use sklearn.model_selection

from sklearn.model_selection import cross_val_predict, KFold

lr = RandomForestClassifier(random_state=1, class_weight="balanced", n_estimators=25, max_depth=6)

# 3 folds without shuffling, matching the defaults of the old KFold(n) call

kf = KFold(n_splits=3)

predictions = cross_val_predict(lr, train_df[features_columns], train_df["target"], cv=kf)

predictions = pd.Series(predictions)

I'm confused about the next step here: how do I use what was learnt above to make predictions on the test data set?

1 Answer

by (40.4k points)

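cross_val_predict fits clones of the estimator on each training fold and returns out-of-fold predictions for the training data; it does not fit lr itself. To predict on the test set, first train the model on the training data with the fit method, then call predict: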

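# Train the model on the full training set

lr.fit(train_df[features_columns], train_df["target"])

# Make predictions on the test set, using the same feature columns as in training

y_pred = lr.predict(test_df[features_columns])

# Compare the predicted labels with the actual labels (requires test_df to contain "target")

accuracy = (y_pred == test_df["target"]).mean()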
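
To see the whole flow end to end, here is a minimal sketch on synthetic data. The data set built with make_classification, the column names f0..f7, and the train/test split are assumptions made only for illustration; they stand in for the train_df, test_df and features_columns from the question. It shows that lr is still unfitted after cross_val_predict, and that fit followed by predict is what produces test-set predictions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import NotFittedError
from sklearn.model_selection import KFold, cross_val_predict, train_test_split

# Hypothetical stand-in for train_df / test_df (not the asker's data).
X, y = make_classification(n_samples=300, n_features=8, random_state=1)
features_columns = [f"f{i}" for i in range(X.shape[1])]
df = pd.DataFrame(X, columns=features_columns)
df["target"] = y
train_df, test_df = train_test_split(df, test_size=0.25, random_state=1)

lr = RandomForestClassifier(random_state=1, class_weight="balanced",
                            n_estimators=25, max_depth=6)

# Out-of-fold predictions: an estimate of performance on the training data only.
oof_pred = cross_val_predict(lr, train_df[features_columns],
                             train_df["target"], cv=KFold(n_splits=3))
cv_accuracy = (oof_pred == train_df["target"].to_numpy()).mean()

# cross_val_predict worked on clones, so lr itself has not been fitted yet.
try:
    lr.predict(test_df[features_columns])
    fitted_before = True
except NotFittedError:
    fitted_before = False

# Fit on the full training set, then predict on the test set.
lr.fit(train_df[features_columns], train_df["target"])
y_pred = lr.predict(test_df[features_columns])
test_accuracy = (y_pred == test_df["target"].to_numpy()).mean()

print("fitted before fit():", fitted_before)
print("cross-validated accuracy:", round(cv_accuracy, 3))
print("test accuracy:", round(test_accuracy, 3))
```

The cross-validated accuracy is useful for judging or tuning the model before touching the test set; the final model used for test predictions is the one fitted on all of the training data.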

