PredictionErrorDisplay
- class skore.PredictionErrorDisplay(*, prediction_error, range_y_true, range_y_pred, range_residuals, data_source, ml_task, report_type)
Visualization of the prediction error of a regression model.
This tool can display “residuals vs predicted” or “actual vs predicted” using scatter plots to qualitatively assess the behavior of a regressor, preferably on held-out data points.
An instance of this class should be created by EstimatorReport.metrics.prediction_error(). You should not create an instance of this class directly.
- Parameters:
- prediction_error : DataFrame
The prediction error data to display. The columns are “estimator_name”, “split_index” (may be null), “y_true”, “y_pred”, and “residuals”.
- range_y_true : RangeData
Global range of the true values.
- range_y_pred : RangeData
Global range of the predicted values.
- range_residuals : RangeData
Global range of the residuals.
- data_source : {“train”, “test”, “X_y”}
The data source used to display the prediction error.
- ml_task : {“regression”, “multioutput-regression”}
The machine learning task.
- report_type : {“comparison-cross-validation”, “comparison-estimator”, “cross-validation”, “estimator”}
The type of report.
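How the “residuals” column and the three range arguments relate to the raw values can be sketched in plain Python. This is an illustration only: the residuals follow the definition given for the “residual_vs_predicted” plot (observed minus predicted), while value_range and its (min, max) tuple are hypothetical stand-ins and not skore's RangeData.

```python
# Illustrative sketch only; not skore's implementation.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.5, 2.0, 8.0]

# Residuals: observed values minus predicted values.
residuals = [t - p for t, p in zip(y_true, y_pred)]


def value_range(values):
    """Hypothetical stand-in for a range: the (min, max) of the values."""
    return (min(values), max(values))


range_y_true = value_range(y_true)
range_y_pred = value_range(y_pred)
range_residuals = value_range(residuals)
```

In practice these objects are built for you by EstimatorReport.metrics.prediction_error(), so this sketch is only meant to make the meaning of each argument concrete.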
- Attributes:
- line_ : matplotlib Artist
Optimal line representing y_true == y_pred. Therefore, it is a diagonal line for kind="actual_vs_predicted" and a horizontal line for kind="residual_vs_predicted".
- errors_lines_ : matplotlib Artist or None
Residual lines. If with_errors=False, then it is set to None.
- scatter_ : list of matplotlib Artist
Scatter data points.
- ax_ : matplotlib Axes
Axes on which the plot is drawn.
- figure_ : matplotlib Figure
Figure containing the scatter and lines.
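Why the optimal line is diagonal in one view and horizontal in the other follows directly from the definition of the residual. A minimal sketch, in which perfect_line is a hypothetical helper (not part of skore) returning the two endpoints of the line over a given range of predicted values:

```python
def perfect_line(kind, pred_min, pred_max):
    """Endpoints ((x0, y0), (x1, y1)) of the line where y_true == y_pred.

    Hypothetical helper for illustration; not part of skore.
    """
    if kind == "actual_vs_predicted":
        # The y-axis shows y_true, so y_true == y_pred is the diagonal y = x.
        return (pred_min, pred_min), (pred_max, pred_max)
    if kind == "residual_vs_predicted":
        # The y-axis shows y_true - y_pred, which is 0 for a perfect model.
        return (pred_min, 0.0), (pred_max, 0.0)
    raise ValueError(f"Unknown kind: {kind!r}")


diagonal = perfect_line("actual_vs_predicted", 2.0, 8.0)
horizontal = perfect_line("residual_vs_predicted", 2.0, 8.0)
```

Points far from this line in either view correspond to large prediction errors.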
Examples
>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import Ridge
>>> from skore import train_test_split
>>> from skore import EstimatorReport
>>> X, y = load_diabetes(return_X_y=True)
>>> split_data = train_test_split(X=X, y=y, random_state=0, as_dict=True)
>>> regressor = Ridge()
>>> report = EstimatorReport(regressor, **split_data)
>>> display = report.metrics.prediction_error()
>>> display.plot(kind="actual_vs_predicted")
- plot(*, estimator_name=None, kind='residual_vs_predicted', data_points_kwargs=None, perfect_model_kwargs=None, despine=True)
Plot visualization.
Extra keyword arguments will be passed to matplotlib’s plot.
- Parameters:
- estimator_name : str, default=None
Name of the estimator used to plot the prediction error. If None, the name inferred from the estimator is used.
- kind : {“actual_vs_predicted”, “residual_vs_predicted”}, default=“residual_vs_predicted”
The type of plot to draw:
“actual_vs_predicted” draws the observed values (y-axis) vs. the predicted values (x-axis).
“residual_vs_predicted” draws the residuals, i.e. difference between observed and predicted values, (y-axis) vs. the predicted values (x-axis).
- data_points_kwargs : dict, default=None
Dictionary with keywords passed to the matplotlib.pyplot.scatter call.
- perfect_model_kwargs : dict, default=None
Dictionary with keywords passed to the matplotlib.pyplot.plot call to draw the optimal line.
- despine : bool, default=True
Whether to remove the top and right spines from the plot.
Examples
>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import Ridge
>>> from skore import train_test_split
>>> from skore import EstimatorReport
>>> X, y = load_diabetes(return_X_y=True)
>>> split_data = train_test_split(X=X, y=y, random_state=0, as_dict=True)
>>> regressor = Ridge()
>>> report = EstimatorReport(regressor, **split_data)
>>> display = report.metrics.prediction_error()
>>> display.plot(kind="actual_vs_predicted")
- set_style(**kwargs)
Set the style parameters for the display.
- Parameters:
- **kwargs : dict
Style parameters to set. Each parameter name should correspond to a style attribute passed to the plot method of the display.
- Returns:
- self : object
Returns the instance itself.
- Raises:
- ValueError
If a style parameter is unknown.
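The contract described above (known style names are stored, unknown names raise ValueError, and the instance is returned so calls can be chained) can be illustrated with a toy class. StyledDisplay is a hypothetical stand-in, not skore's implementation, and the set of accepted style names is assumed here from the keyword-dictionary parameters of plot.

```python
class StyledDisplay:
    """Hypothetical stand-in mimicking the documented set_style contract."""

    # Style names accepted by this toy class (an assumption for illustration).
    _known_styles = ("data_points_kwargs", "perfect_model_kwargs")

    def __init__(self):
        self.styles = {}

    def set_style(self, **kwargs):
        for name, value in kwargs.items():
            if name not in self._known_styles:
                raise ValueError(f"Unknown style parameter: {name}")
            self.styles[name] = value
        return self  # returning the instance allows chained calls


display = StyledDisplay()
same = display.set_style(data_points_kwargs={"alpha": 0.5})

# An unknown style name is rejected.
try:
    display.set_style(colour="red")
    raised = False
except ValueError:
    raised = True
```

Because the instance itself is returned, a style can be set and the plot drawn in a single chained expression.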