
I have a multi-class classification problem and my dataset is skewed: one class has 100 instances and another has only 10. I want to split the dataset so that the ratio between classes is preserved. For example, if I want 30% of the records to go into the training set, it should contain 30 of the 100 instances of the first class, 3 of the 10 instances of the second class, and so on.

1 Answer


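Use train_test_split from scikit-learn and pass your labels to its stratify parameter. The split then preserves the class proportions of y in both the training set and the test set. Note that test_size=0.25 in the call below sends 25% of the records to the test set; to put 30% in the training set, as in your example, pass train_size=0.3 instead.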

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.25
)
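To check that the split really keeps the class ratio, here is a minimal sketch on a made-up dataset that mirrors the numbers in the question (100 samples of one class, 10 of another). The toy arrays, the random_state value, and the use of train_size=0.3 are assumptions for illustration only; they are not part of the original answer.

```python
from collections import Counter

import numpy as np
from sklearn.model_selection import train_test_split

# Toy skewed dataset (invented for illustration):
# 100 samples of class 0 and 10 samples of class 1.
y = np.array([0] * 100 + [1] * 10)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder feature column

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    train_size=0.3,   # 30% of the rows go to the training set
    stratify=y,       # keep per-class proportions in both splits
    random_state=0,   # fixed seed so the split is reproducible
)

# Count how many samples of each class ended up in each split.
train_counts = Counter(y_train.tolist())
test_counts = Counter(y_test.tolist())
print("train:", dict(train_counts))
print("test: ", dict(test_counts))
```

On this toy data the training set should hold 30 samples of class 0 and 3 of class 1, with the remaining 70 and 7 in the test set. Keep in mind that stratify needs at least two samples of every class, otherwise train_test_split raises a ValueError.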

