in Machine Learning by (19k points)

I have a multi-class classification problem and my dataset is skewed: one class has 100 instances and another has, say, only 10. I want to split the dataset while keeping the ratio between classes. For example, if I want 30% of the records to go into the training set, it should contain 30 of the 100 instances of the first class, 3 of the 10 instances of the second class, and so on.

1 Answer

by (33.1k points)

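Pass your labels to the stratify parameter of train_test_split. The split then keeps the class proportions of y roughly the same in the training and test sets. Note that test_size=0.25 holds out 25% of the data for testing; to put 30% of the records in the training set, as in the question, use train_size=0.3 instead.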

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.25
)
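
To check that the stratified split does what the question asks, here is a minimal sketch on made-up data. The toy arrays X and y (100 samples of class 0, 10 of class 1), the train_size=0.3 setting, and the random_state value are assumptions chosen to mirror the numbers in the question, not part of the original answer.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 samples of class 0 and 10 samples of class 1.
y = np.array([0] * 100 + [1] * 10)
X = np.arange(len(y)).reshape(-1, 1)  # one dummy feature per sample

# train_size=0.3 sends 30% of the records to the training set;
# stratify=y keeps the class ratio the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    train_size=0.3,
    stratify=y,
    random_state=0,  # fixed seed so the split is reproducible
)

# Count how many samples of each class ended up in each part.
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
print("train counts per class:", train_counts)
print("test counts per class:", test_counts)
```

With these numbers the training set should hold 30 samples of the large class and 3 of the small one, with the remainder in the test set. When the class sizes do not divide evenly, the per-class counts are rounded, so the ratio is only approximately preserved.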

