Unbalanced classification using RandomForestClassifier in sklearn

Question

asked Jul 9, 2019 in Machine Learning by ParasSharma1 (19k points)

I have a dataset where the classes are unbalanced. The classes are either '1' or '0', and the ratio of class '1' to class '0' is 5:1. How do you calculate the prediction error for each class and then rebalance the weights accordingly in sklearn with Random Forest, kind of like in the following link: http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#balance

Answer:

For your problem, if class 1 is represented five times as often as class 0 and you want to balance the class distributions, then simply use:

import numpy as np

sample_weight = np.array([5 if i == 0 else 1 for i in y])

This assigns a weight of 5 to every class 0 instance and a weight of 1 to every class 1 instance. Pass the array to the classifier's fit method through its sample_weight argument.
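To make that concrete, here is a minimal sketch of fitting with those weights and then reading off the prediction error for each class, which is the other half of the question. The synthetic data from make_classification, the variable names, and the choice of 100 trees are all assumptions for illustration and not part of the original answer.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: roughly 5:1 in favour of class 1.
X, y = make_classification(
    n_samples=1200, n_features=10, weights=[1 / 6, 5 / 6], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Weight 5 for the minority class 0, weight 1 for class 1.
sample_weight = np.where(y_train == 0, 5, 1)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train, sample_weight=sample_weight)

# Rows of the confusion matrix are true labels, columns are predictions,
# so 1 - diagonal / row total is the error rate for each true class.
cm = confusion_matrix(y_test, clf.predict(X_test), labels=[0, 1])
per_class_error = 1 - cm.diagonal() / cm.sum(axis=1)
print("error for class 0:", per_class_error[0])
print("error for class 1:", per_class_error[1])
```

If you would rather not build the array by hand, RandomForestClassifier also accepts class_weight="balanced" (or an explicit dict such as {0: 5, 1: 1}) in its constructor, which weights classes inversely to their frequency in the training labels.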

You can learn more about Random Forest and its parameters from this blog, which walks through the code with scikit-learn to make it easier to understand.

Hope this answer helps.

1 Answer

Anurag · Answer 1 · 2019-07-09T10:37:16+0000
