ComparisonReport.get_predictions
- ComparisonReport.get_predictions(*, data_source, response_method='predict', X=None, pos_label=<DEFAULT>)
Get predictions from the underlying reports.
If the predictions were already computed in a previous call, this method reloads them from the cache instead of recomputing them.
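The caching behaviour described above can be pictured with a small, self-contained sketch. It is not skore's implementation: `PredictionCache`, its `get` method, and the choice of cache key are hypothetical names invented for illustration. The only assumption is that a result is stored under the arguments that produced it and reused on the next identical call.

```python
from typing import Any, Callable


class PredictionCache:
    """Hypothetical stand-in for a report's prediction cache (not skore's code)."""

    def __init__(self) -> None:
        self._store: dict = {}
        self.n_computed = 0  # how many times predictions were actually computed

    def get(
        self,
        data_source: str,
        response_method: str,
        pos_label: Any,
        compute: Callable[[], list],
    ) -> list:
        # Assumed cache key: the arguments that determine the predictions.
        key = (data_source, response_method, pos_label)
        if key not in self._store:
            self._store[key] = compute()
            self.n_computed += 1
        return self._store[key]


cache = PredictionCache()
first = cache.get("test", "predict", None, lambda: [0, 1, 1])
second = cache.get("test", "predict", None, lambda: [0, 1, 1])  # served from cache
other = cache.get("train", "predict", None, lambda: [1, 0])  # new key, computed
```

In this sketch the second call returns the very same object as the first, and only a call with different arguments triggers a new computation.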
- Parameters:
- data_source {"test", "train", "X_y"}, default="test"
The data source to use.
"test" : use the test set provided when creating the report.
"train" : use the train set provided when creating the report.
"X_y" : use the provided X and y to compute the metric.
- response_method {"predict", "predict_proba", "decision_function"}, default="predict"
The response method to use to get the predictions.
- X array-like of shape (n_samples, n_features), optional
When data_source is "X_y", the input features on which to compute the response method.
- pos_label int, float, bool, str or None, default=_DEFAULT
The label to consider as the positive class when computing predictions in binary classification cases. By default, the positive class is set to the one provided when creating the report. If None, estimator_.classes_[1] is used as positive label.
When pos_label is equal to estimator_.classes_[0], it will be equivalent to estimator_.predict_proba(X)[:, 0] for response_method="predict_proba" and -estimator_.decision_function(X) for response_method="decision_function".
- Returns:
- list of np.ndarray of shape (n_samples,) or (n_samples, n_classes) or list of such lists
The predictions for each EstimatorReport or CrossValidationReport.
- Raises:
- ValueError
If the data source is invalid.
Examples
>>> from sklearn.datasets import make_classification
>>> from skore import train_test_split
>>> from sklearn.linear_model import LogisticRegression
>>> from skore import ComparisonReport, EstimatorReport
>>> X, y = make_classification(random_state=42)
>>> split_data = train_test_split(X=X, y=y, random_state=42, as_dict=True)
>>> estimator_1 = LogisticRegression()
>>> estimator_report_1 = EstimatorReport(estimator_1, **split_data)
>>> estimator_2 = LogisticRegression(C=2)  # Different regularization
>>> estimator_report_2 = EstimatorReport(estimator_2, **split_data)
>>> report = ComparisonReport([estimator_report_1, estimator_report_2])
>>> report.cache_predictions()
>>> predictions = report.get_predictions(data_source="test")
>>> print([split_predictions.shape for split_predictions in predictions])
[(25,), (25,)]
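The pos_label note in the parameters says that choosing estimator_.classes_[0] as the positive class amounts to taking column 0 of predict_proba, or negating decision_function. A small worked sketch may make that relationship concrete. It assumes a binary model with a logistic link (as in LogisticRegression), where the probability of classes_[1] is the sigmoid of the decision value. The decision values are made up for illustration, and the code uses only the standard library rather than skore or scikit-learn.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a decision value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Made-up decision values; positive values favour classes_[1].
decision_values = [-2.0, 0.0, 1.5]

# Probability of classes_[1] under the assumed logistic link.
proba_class_1 = [sigmoid(d) for d in decision_values]

# Probability of classes_[0] is the complement (column 0 of predict_proba).
proba_class_0 = [1.0 - p for p in proba_class_1]

# Negating the decision value re-orients the score towards classes_[0].
proba_from_negated = [sigmoid(-d) for d in decision_values]
```

Here `proba_class_0` and `proba_from_negated` agree up to floating-point error, which is why a negated decision value can serve as the score for classes_[0].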