TrainTestSplit
- class skore.TrainTestSplit(test_size=0.2, train_size=None, random_state=0, shuffle=True, stratify=None)
Single train-test split implementing the cross-validation protocol.
This splitter wraps sklearn.model_selection.train_test_split() and exposes split/get_n_splits so that it can be passed as the splitter argument of any skore or scikit-learn function.
- Parameters:
- test_size : float or int or None, default=0.2
Proportion (float) or absolute number (int) of samples for the test set. When None, the complement of train_size is used.
- train_size : float or int or None, default=None
Proportion (float) or absolute number (int) of samples for the training set. When None, the complement of test_size is used.
- random_state : int, RandomState instance or None, default=0
Controls the shuffling applied before splitting. Pass an int for reproducible output across multiple calls.
- shuffle : bool, default=True
Whether to shuffle the data before splitting.
- stratify : array-like or None, default=None
If not None, data is split in a stratified fashion using this as the class labels.
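As a rough illustration of what the stratify parameter does, the sketch below calls sklearn.model_selection.train_test_split() directly (the function this class is documented to wrap) rather than TrainTestSplit itself. The toy labels and all variable names are assumptions made up for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy, imbalanced labels (hypothetical data): 16 samples of class 0, 4 of class 1.
y = np.array([0] * 16 + [1] * 4)
indices = np.arange(len(y))

# Passing the labels as `stratify` keeps the class ratio in both subsets.
train_idx, test_idx = train_test_split(
    indices, test_size=0.25, random_state=0, shuffle=True, stratify=y
)

train_ratio = y[train_idx].mean()  # share of class 1 in the training set
test_ratio = y[test_idx].mean()    # share of class 1 in the test set
print(len(train_idx), len(test_idx), train_ratio, test_ratio)
```

With these sizes the 20% share of class 1 can be reproduced exactly in both subsets; with less convenient sizes the ratios are only approximately preserved.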
Examples
>>> import numpy as np
>>> from skore import TrainTestSplit
>>> splitter = TrainTestSplit(test_size=0.3, random_state=0)
>>> X = np.arange(20).reshape(10, 2)
>>> for train, test in splitter.split(X):
...     train, test
(array([9, 1, 6, 7, 3, 0, 5]), array([2, 8, 4]))
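To make the "wraps train_test_split() and exposes split/get_n_splits" description more concrete, here is a minimal sketch of how such a wrapper could be written. SingleSplitSketch is a hypothetical stand-in invented for this example; it is not skore's implementation, and only the scikit-learn calls are real APIs.

```python
import numpy as np
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier


class SingleSplitSketch:
    """Hypothetical stand-in for illustration; not skore's actual code."""

    def __init__(self, test_size=0.2, train_size=None, random_state=0,
                 shuffle=True, stratify=None):
        self.test_size = test_size
        self.train_size = train_size
        self.random_state = random_state
        self.shuffle = shuffle
        self.stratify = stratify

    def get_n_splits(self, X=None, y=None, groups=None):
        # A train-test split behaves like cross-validation with exactly one fold.
        return 1

    def split(self, X, y=None, groups=None):
        # Split the row positions rather than the data, then yield the pair once.
        indices = np.arange(len(X))
        train, test = train_test_split(
            indices,
            test_size=self.test_size,
            train_size=self.train_size,
            random_state=self.random_state,
            shuffle=self.shuffle,
            stratify=self.stratify,
        )
        yield train, test


# Toy data (assumption): 20 samples, two balanced classes.
X = np.arange(40).reshape(20, 2)
y = (np.arange(20) >= 10).astype(int)

# scikit-learn accepts an object exposing split/get_n_splits as `cv`.
scores = cross_val_score(
    DecisionTreeClassifier(random_state=0), X, y,
    cv=SingleSplitSketch(test_size=0.25),
)
print(len(scores))  # one score, because there is a single split
```

Splitting an index array instead of the data itself is one simple way to turn train_test_split() into an index generator; the real class may do this differently.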
- split(X, y=None, groups=None)
Generate a single train-test split of indices.
- Parameters:
- X : array-like
Training data used to determine the number of samples.
- y : array-like or None, default=None
Ignored, present for API compatibility.
- groups : array-like or None, default=None
Ignored, present for API compatibility.
- Yields:
- train : ndarray
The training set indices.
- test : ndarray
The testing set indices.
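The yielded values are positional indices, so they are typically used to slice the data. The sketch below shows that pattern with scikit-learn's ShuffleSplit(n_splits=1) as a stand-in splitter, on the assumption that it behaves comparably here: it has the same split(X, y, groups) signature and also yields index arrays without using y. The data and variable names are made up for the example.

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit

X = np.arange(20).reshape(10, 2)
y = np.arange(10) % 2

# Stand-in splitter (assumption): one shuffled split, same split() signature.
splitter = ShuffleSplit(n_splits=1, test_size=0.3, random_state=0)

# The generator yields positions, which are then used to slice X and y.
train, test = next(splitter.split(X))
X_train, X_test = X[train], X[test]
y_train, y_test = y[train], y[test]

# y is not needed to build the split: passing it gives the same indices.
train_with_y, test_with_y = next(splitter.split(X, y))
same_split = (
    np.array_equal(train, train_with_y) and np.array_equal(test, test_with_y)
)
print(X_train.shape, X_test.shape, same_split)
```

Because split() is a generator, a single pair can be taken with next() or unpacked in a for loop, as in the class-level example.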