TrainTestSplit

class skore.TrainTestSplit(test_size=0.2, train_size=None, random_state=0, shuffle=True, stratify=None)

A single train-test split that implements the scikit-learn cross-validation splitter protocol.

This splitter wraps sklearn.model_selection.train_test_split() and exposes split and get_n_splits, so it can be passed as the splitter argument of skore functions or wherever scikit-learn accepts a cross-validation splitter (the cv argument).
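To make the protocol concrete, here is a minimal sketch of a splitter with the same shape. The class name `SingleSplitSketch` is hypothetical and this is not skore's implementation; it only assumes what the description above states, namely that the indices come from `sklearn.model_selection.train_test_split()` and are exposed through `split` and `get_n_splits`.

```python
import numpy as np
from sklearn.model_selection import train_test_split


class SingleSplitSketch:
    """Hypothetical stand-in illustrating the splitter protocol (not skore code)."""

    def __init__(self, test_size=0.2, train_size=None, random_state=0,
                 shuffle=True, stratify=None):
        self.test_size = test_size
        self.train_size = train_size
        self.random_state = random_state
        self.shuffle = shuffle
        self.stratify = stratify

    def get_n_splits(self, X=None, y=None, groups=None):
        # A single train-test split is always exactly one split.
        return 1

    def split(self, X, y=None, groups=None):
        # Split the row indices rather than the data itself, so callers
        # receive index arrays as with any scikit-learn splitter.
        indices = np.arange(len(X))
        train, test = train_test_split(
            indices,
            test_size=self.test_size,
            train_size=self.train_size,
            random_state=self.random_state,
            shuffle=self.shuffle,
            stratify=self.stratify,
        )
        yield train, test


X = np.arange(20).reshape(10, 2)
sketch = SingleSplitSketch(test_size=0.3, random_state=0)
train, test = next(sketch.split(X))
print(len(train), len(test), sketch.get_n_splits())
```

Yielding from `split` rather than returning a tuple is what lets the object be iterated like any multi-fold splitter, even though it only ever produces one pair.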

Parameters:

test_size (float or int or None, default=0.2): Proportion (float) or absolute number (int) of samples for the test set. When None, the complement of train_size is used.
train_size (float or int or None, default=None): Proportion (float) or absolute number (int) of samples for the training set. When None, the complement of test_size is used.
random_state (int, RandomState instance or None, default=0): Controls the shuffling applied before splitting. Pass an int for reproducible output across multiple calls.
shuffle (bool, default=True): Whether to shuffle the data before splitting.
stratify (array-like or None, default=None): If not None, data is split in a stratified fashion using this as the class labels.
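Since the class wraps `train_test_split()`, the size and stratification parameters can be illustrated by calling that function directly. The sketch below assumes `TrainTestSplit` forwards these arguments unchanged; the toy labels are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

indices = np.arange(10)
# Toy labels (illustrative only): six samples of class 0, four of class 1.
y = np.array([0] * 6 + [1] * 4)

# A float test_size is a proportion: 0.3 of 10 samples gives 3 test samples.
_, test_float = train_test_split(indices, test_size=0.3, random_state=0)

# An int test_size is an absolute number of test samples.
_, test_int = train_test_split(indices, test_size=2, random_state=0)

# With stratify, each side keeps the 60/40 class ratio found in y.
train_strat, test_strat = train_test_split(
    indices, test_size=0.5, random_state=0, stratify=y
)
test_class_counts = np.bincount(y[test_strat])

print(len(test_float), len(test_int), test_class_counts.tolist())
```

Note that in scikit-learn's `train_test_split()`, stratification requires shuffling, so `stratify` must be None when `shuffle=False`.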

See also

sklearn.model_selection.train_test_split(): Underlying scikit-learn helper used to generate the split.
evaluate(): Evaluate an estimator using this splitter via the splitter parameter.

Examples

>>> import numpy as np
>>> from skore import TrainTestSplit
>>> splitter = TrainTestSplit(test_size=0.3, random_state=0)
>>> X = np.arange(20).reshape(10, 2)
>>> for train, test in splitter.split(X):
...     train, test
(array([9, 1, 6, 7, 3, 0, 5]), array([2, 8, 4]))

get_n_splits(X=None, y=None, groups=None): Return the number of splits (always 1).

split(X, y=None, groups=None)

Generate a single train-test split of indices.

Parameters:

X (array-like): Training data used to determine the number of samples.
y (array-like or None, default=None): Ignored, present for API compatibility.
groups (array-like or None, default=None): Ignored, present for API compatibility.

Yields:

train (ndarray): The training set indices.
test (ndarray): The testing set indices.
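
The yielded index arrays are typically used to slice the data, and a splitter that follows this protocol can also be handed to scikit-learn as `cv`. The sketch below uses scikit-learn's `ShuffleSplit(n_splits=1)` as a stand-in because it follows the same `split` / `get_n_splits` protocol; it assumes a `TrainTestSplit` instance could be substituted for it. The synthetic data is invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit, cross_val_score

# Synthetic data (illustrative only): the label depends on the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = (X[:, 0] > 0).astype(int)

# Stand-in single-split splitter; not the skore class itself.
splitter = ShuffleSplit(n_splits=1, test_size=0.25, random_state=0)

# Consume the generator once and slice the data with the index arrays.
train, test = next(splitter.split(X, y))
X_train, X_test = X[train], X[test]

# A single-split splitter passed as `cv` produces exactly one score.
scores = cross_val_score(LogisticRegression(), X, y, cv=splitter)
print(X_train.shape, X_test.shape, len(scores))
```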