in Machine Learning by (18.4k points)
I am eager to learn data science in Python, but I am confused about two terms: train and test. In some programs the train and test datasets are separate, and in others they are combined into one dataset. Please advise.

1 Answer

by (36.8k points)

Train dataset: As the name suggests, this is the subset of the full dataset used to train (fit) the model. It is usually larger than the test dataset, because the model generally learns better from more data.

Test dataset: This is the subset of the full dataset used to test the model that was trained on the training dataset. It is usually smaller than the training dataset and is kept out of training, so the accuracy calculated on it reflects how the model performs on data it has not seen.

Example:

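library(caTools)  # provides sample.split

# sample.split expects a vector of labels; 'target' is a placeholder for the label column in data
sample = sample.split(data$target, SplitRatio = 0.65)

train = subset(data, sample == TRUE)   # rows flagged TRUE form the training set

test = subset(data, sample == FALSE)   # the remaining rows form the test set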

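In the example above, we split a single dataset in two. SplitRatio is the fraction of rows assigned to the training set: 65% of the dataset becomes the train dataset and the remaining 35% becomes the test dataset. When a program starts from one combined dataset, this is the step that separates it. When the train and test data are already provided separately, the split has simply been done beforehand.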
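Since the question is about Python, here is a rough sketch of the same 65/35 split using only the standard library. The function name train_test_split_rows, the toy dataset, and the fixed seed are made up for illustration. It is a plain random split and does not try to preserve label proportions.

```python
import random


def train_test_split_rows(rows, train_ratio=0.65, seed=42):
    """Shuffle rows and split them into (train, test) lists."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(rows)  # copy, so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    cut = int(len(shuffled) * train_ratio)  # number of rows used for training
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy dataset: 20 rows of (feature, label).
data = [(i, i % 2) for i in range(20)]

train, test = train_test_split_rows(data, train_ratio=0.65)
print(len(train), len(test))  # 13 7
```

In practice, many Python programs use scikit-learn's train_test_split from sklearn.model_selection for this step instead of writing it by hand.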
 
