0 votes
in Data Science by (17.6k points)

I'm confused about using cross_val_predict with a test data set.

I created a simple Random Forest model and used cross_val_predict to make predictions:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection replaces it.
from sklearn.model_selection import cross_val_predict, KFold

lr = RandomForestClassifier(random_state=1, class_weight="balanced", n_estimators=25, max_depth=6)

# The old KFold(n, ...) defaulted to 3 unshuffled folds; n_splits=3 keeps that behaviour.
kf = KFold(n_splits=3)

predictions = cross_val_predict(lr, train_df[features_columns], train_df["target"], cv=kf)

predictions = pd.Series(predictions)

I'm confused about the next step here. How do I use what was learnt above to make predictions on the test data set?

1 Answer

0 votes
by (41.4k points)

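cross_val_predict only returns out-of-fold predictions for the rows you pass in. It fits clones of the estimator on each fold, so lr itself is never fitted. To predict on the test data set, the model has to be trained with the fit method first and then used to predict.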

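# Training the model on the full training set
lr.fit(train_df[features_columns], train_df["target"])

# Making predictions on the test set
y_pred = lr.predict(test_df[features_columns])

# Comparing the predicted y values to the actual y values
# (only possible if test_df has a "target" column).
accuracy = (y_pred == test_df["target"]).mean()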
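
To see the whole flow in one place, here is a minimal, self-contained sketch. The data is synthetic (generated with make_classification and split with train_test_split), so train_df, test_df and the column names f0..f4 are stand-ins for illustration, not the asker's data. It first gets out-of-fold predictions with cross_val_predict, then checks that lr is still unfitted, and only then fits on the full training set and predicts on the test set.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import NotFittedError
from sklearn.model_selection import KFold, cross_val_predict, train_test_split
from sklearn.utils.validation import check_is_fitted

# Synthetic stand-in for train_df / test_df (assumption: not the original data).
X, y = make_classification(n_samples=300, n_features=5, random_state=1)
features_columns = [f"f{i}" for i in range(X.shape[1])]
df = pd.DataFrame(X, columns=features_columns)
df["target"] = y
train_df, test_df = train_test_split(df, test_size=0.25, random_state=1)

lr = RandomForestClassifier(random_state=1, class_weight="balanced",
                            n_estimators=25, max_depth=6)

# Step 1: out-of-fold predictions, one per training row.
# These estimate how well the model generalises; they do not train lr.
kf = KFold(n_splits=3)
oof_pred = cross_val_predict(lr, train_df[features_columns],
                             train_df["target"], cv=kf)
cv_accuracy = (oof_pred == train_df["target"].to_numpy()).mean()

# Step 2: confirm that lr is still unfitted after cross_val_predict.
try:
    check_is_fitted(lr)
    fitted_after_cv = True
except NotFittedError:
    fitted_after_cv = False

# Step 3: fit on the full training set, then predict on the test set.
lr.fit(train_df[features_columns], train_df["target"])
y_pred = lr.predict(test_df[features_columns])
test_accuracy = (y_pred == test_df["target"].to_numpy()).mean()

print("fitted after cross_val_predict:", fitted_after_cv)
print("cross-validated accuracy:", cv_accuracy)
print("test accuracy:", test_accuracy)
```

The cross-validated accuracy is an estimate made from the training data alone, while the test accuracy comes from the single model fitted in step 3. The two numbers are usually close but need not match.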

