Machine Learning: Move Threshold

Question

asked Jul 5, 2019 in Data Science by sourav (17.6k points)

I'm trying to solve a binary classification problem where 80% of the data belongs to class x and 20% belongs to class y. All my models (AdaBoost, neural networks and SVC) simply predict class x for every entry, as this is the highest accuracy they can achieve.

My goal is higher precision for class x, and I don't care how many entries are wrongly classified as class y.

My idea would be to just put entries in class x when the model is super sure about them and put them in class y otherwise.

How would I achieve this? Is there a way to move the threshold so that only very obvious entries are classified as class x?

I'm using Python and scikit-learn (sklearn).

Sample Code:

from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import confusion_matrix

adaboost = AdaBoostClassifier(random_state=1)
adaboost.fit(X_train, y_train)
adaboost_prediction = adaboost.predict(X_test)

confusion_matrix(adaboost_prediction, y_test) outputs:

array([[    0,     0],
       [10845, 51591]])

1 Answer

Shlok Pandey · Answer 1 · 2019-07-06T05:12:43+0000

AdaBoostClassifier can output class probabilities through predict_proba. Instead of calling predict, take the probability of class x and apply your own threshold to it:

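from sklearn.ensemble import AdaBoostClassifier

adaboost = AdaBoostClassifier(random_state=1)
adaboost.fit(X_train, y_train)
adaboost_probs = adaboost.predict_proba(X_test)  # shape (n_samples, 2); columns follow adaboost.classes_
threshold = 0.8  # for example
# Column 1 is assumed to be class x here; check adaboost.classes_ for the actual order
thresholded_adaboost_prediction = adaboost_probs[:, 1] > threshold  # True only where class x is confident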
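Note that predict_proba returns one column per class, so the threshold has to be compared against the column that belongs to class x. Below is a minimal sketch of that step using only NumPy. The helper names predict_with_threshold and precision_for, and the toy labels "x" and "y", are hypothetical and not part of scikit-learn; the toy array stands in for the output of adaboost.predict_proba(X_test).

```python
import numpy as np


def predict_with_threshold(probs, classes, confident_label, threshold=0.8):
    """Assign `confident_label` only where its probability exceeds `threshold`.

    probs:   array of shape (n_samples, 2), e.g. the output of predict_proba
    classes: label order of the columns, e.g. the fitted model's classes_
    Every other sample receives the remaining label.
    """
    probs = np.asarray(probs)
    classes = np.asarray(classes)
    if probs.ndim != 2 or probs.shape[1] != 2 or classes.shape != (2,):
        raise ValueError("expected binary probabilities of shape (n_samples, 2)")
    matches = np.flatnonzero(classes == confident_label)
    if matches.size != 1:
        raise ValueError("confident_label not found in classes")
    col = matches[0]                 # column holding P(confident_label)
    other_label = classes[1 - col]   # fallback label for unsure samples
    return np.where(probs[:, col] > threshold, confident_label, other_label)


def precision_for(label, y_true, y_pred):
    """Fraction of samples predicted as `label` that truly are `label`."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    predicted = y_pred == label
    if not predicted.any():
        return float("nan")  # nothing was predicted as `label`
    return float(np.mean(y_true[predicted] == label))


# Toy probabilities (hypothetical); columns are ordered as ["x", "y"].
toy_probs = np.array([[0.95, 0.05], [0.70, 0.30], [0.85, 0.15], [0.40, 0.60]])
toy_truth = np.array(["x", "y", "x", "y"])
toy_pred = predict_with_threshold(toy_probs, ["x", "y"], "x", threshold=0.8)
```

With the fitted model from above, the call would look like predict_with_threshold(adaboost.predict_proba(X_test), adaboost.classes_, label_of_x, 0.8), where label_of_x is whatever value class x has in y_train. AdaBoost's probabilities can sit fairly close to 0.5, so 0.8 may turn out to be too strict; it is worth trying several thresholds on a held-out validation set and checking precision_for at each one.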

