in Machine Learning by (19k points)

I have a dataset with unbalanced classes. The classes are either '1' or '0', and the ratio of class '1' to class '0' is 5:1. How do you calculate the prediction error for each class and then rebalance the weights accordingly in sklearn with Random Forest, as described at the following link: http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#balance

Answer:

If class 1 appears five times as often as class 0 and you want to balance the class distributions, give each sample a weight inversely proportional to its class frequency and pass those weights to fit():

import numpy as np
sample_weight = np.array([5 if i == 0 else 1 for i in y])

This assigns a weight of 5 to every class 0 instance and a weight of 1 to every class 1 instance.
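To tie the weighting back to the per-class prediction error asked about in the question, here is a minimal end-to-end sketch. The synthetic dataset from make_classification, the train/test split, and the variable names are assumptions made for illustration only; the per-class error is taken here as one minus the per-class recall read off the confusion matrix, which is one reasonable reading of "prediction error for each class", not necessarily the exact procedure in the linked Breiman page.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Illustrative synthetic data with roughly a 5:1 ratio of class 1 to class 0.
X, y = make_classification(
    n_samples=1200, n_features=10, weights=[1 / 6, 5 / 6], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Weight every class 0 (minority) sample 5x, as in the answer above.
sample_weight = np.array([5 if label == 0 else 1 for label in y_train])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train, sample_weight=sample_weight)

# Rows of the confusion matrix are true classes, columns are predictions.
cm = confusion_matrix(y_test, clf.predict(X_test), labels=[0, 1])

# Per-class error = fraction of each true class that was misclassified.
per_class_error = 1 - np.diag(cm) / cm.sum(axis=1)
for label, err in zip([0, 1], per_class_error):
    print(f"class {label}: error rate {err:.3f}")
```

If the error for one class stays much higher than for the other, the weights can be adjusted and the model refit. RandomForestClassifier also accepts class_weight="balanced", which derives similar weights from the class frequencies in y instead of hard-coding the factor of 5.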

You can learn more about Random Forest and its parameters from this blog, which walks through the code with scikit-learn to make it easier to follow.

Hope this answer helps.

