What does RepeatedKFold actually mean?

1 Answer

answered Jul 12, 2019 by Shlok Pandey (41.4k points)

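RepeatedKFold repeats K-fold cross-validation n_repeats times, with a different randomization in each repetition. You can get the same effect by calling KFold.split() n_repeats times in a loop, with shuffling enabled and a different random state on each pass.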

Example:

import numpy as np
from sklearn.model_selection import RepeatedKFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([0, 0, 1, 1])
Then after running:
rkf = RepeatedKFold(n_splits=2, n_repeats=1, random_state=2652124)
for train_index, test_index in rkf.split(X):
    print("TRAIN:", train_index, "TEST:", test_index)
Output:
TRAIN: [0 1] TEST: [2 3]
TRAIN: [2 3] TEST: [0 1]

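This is similar to what KFold(n_splits=2, shuffle=True, random_state=2652124) would do. Note that shuffle=True is required: without it, KFold splits the data in order, and recent scikit-learn releases raise an error if random_state is set while shuffle is False.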
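To check the comparison yourself, here is a minimal sketch that prints the folds from both splitters side by side. It assumes a reasonably recent scikit-learn; the variable names rkf_splits and kf_splits are just for illustration. With the same seed the two lists should match, but I have not confirmed that across every version, so treat it as something to verify rather than a guarantee.

```python
import numpy as np
from sklearn.model_selection import KFold, RepeatedKFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])

# A single repetition of RepeatedKFold ...
rkf = RepeatedKFold(n_splits=2, n_repeats=1, random_state=2652124)
# ... compared with one shuffled KFold using the same seed.
kf = KFold(n_splits=2, shuffle=True, random_state=2652124)

# Convert index arrays to plain lists so they are easy to compare and print.
rkf_splits = [(tr.tolist(), te.tolist()) for tr, te in rkf.split(X)]
kf_splits = [(tr.tolist(), te.tolist()) for tr, te in kf.split(X)]

print("RepeatedKFold:", rkf_splits)
print("KFold:        ", kf_splits)
```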

Now, changing to n_repeats=2 gives this output:

TRAIN: [0 1] TEST: [2 3]
TRAIN: [2 3] TEST: [0 1]
TRAIN: [1 2] TEST: [0 3]
TRAIN: [0 3] TEST: [1 2]
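And so on: each extra repeat adds another n_splits splits with a fresh shuffle, for n_splits * n_repeats splits in total.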
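The "KFold in a loop" idea can be sketched by hand as follows. This is only an illustration under my own assumptions, not necessarily how RepeatedKFold is implemented internally: the shared generator rng and the list manual_splits are names I made up. Passing one RandomState object to every KFold means each pass draws a different shuffle.

```python
import numpy as np
from sklearn.model_selection import KFold, RepeatedKFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
n_splits, n_repeats = 2, 2

# One shared random generator, so every pass shuffles differently.
rng = np.random.RandomState(2652124)

manual_splits = []
for repeat in range(n_repeats):
    # A fresh shuffled KFold per repetition, drawing from the shared generator.
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=rng)
    for train_index, test_index in kf.split(X):
        manual_splits.append((train_index.tolist(), test_index.tolist()))
        print("repeat", repeat, "TRAIN:", train_index, "TEST:", test_index)

# RepeatedKFold reports the total number of splits: n_splits * n_repeats.
rkf = RepeatedKFold(n_splits=n_splits, n_repeats=n_repeats, random_state=2652124)
total_splits = rkf.get_n_splits(X)
print("total splits:", total_splits)
```

Within each repetition every sample lands in exactly one test fold, which is why repeating the procedure gives several independent estimates instead of reusing the same partition.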
