PredictionErrorDisplay

class skore.PredictionErrorDisplay(*, prediction_error, range_y_true, range_y_pred, range_residuals, data_source, ml_task, report_type)[source]

Visualization of the prediction error of a regression model.

This tool can display “residuals vs predicted” or “actual vs predicted” using scatter plots to qualitatively assess the behavior of a regressor, preferably on held-out data points.

An instance of this class is created by EstimatorReport.metrics.prediction_error(); you should not instantiate it directly.

Parameters:
prediction_error : DataFrame

The prediction error data to display. The columns are:

  • estimator

  • split (may be null)

  • y_true

  • y_pred

  • residuals

range_y_true : RangeData

Global range of the true values.

range_y_pred : RangeData

Global range of the predicted values.

range_residuals : RangeData

Global range of the residuals.

data_source : {“train”, “test”, “X_y”, “both”}

The data source used to display the prediction error.

ml_task : {“regression”, “multioutput-regression”}

The machine learning task.

report_type : {“comparison-cross-validation”, “comparison-estimator”, “cross-validation”, “estimator”}

The type of report.

Attributes:
facet_ : seaborn FacetGrid

FacetGrid containing the prediction error.

figure_ : matplotlib Figure

The figure on which the prediction error is plotted.

ax_ : matplotlib Axes

The axes on which the prediction error is plotted.

Examples

>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import Ridge
>>> from skore import train_test_split
>>> from skore import EstimatorReport
>>> X, y = load_diabetes(return_X_y=True)
>>> split_data = train_test_split(X=X, y=y, random_state=0, as_dict=True)
>>> regressor = Ridge()
>>> report = EstimatorReport(regressor, **split_data)
>>> display = report.metrics.prediction_error()
>>> display.plot(kind="actual_vs_predicted")
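
Since the matplotlib objects are exposed as attributes (figure_, ax_), the rendered plot can be customized or saved with plain matplotlib calls; for instance, continuing the example above:

>>> display.figure_.savefig("prediction_error.png")  # doctest: +SKIP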
frame()[source]

Get the data used to create the prediction error plot.

Returns:
DataFrame

A DataFrame containing the prediction error data with columns depending on the report type:

  • estimator: Name of the estimator (when comparing estimators)

  • split: Cross-validation split ID (when doing cross-validation)

  • y_true: True target values

  • y_pred: Predicted target values

  • residuals: Difference between true and predicted values (y_true - y_pred)

Examples

>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import Ridge
>>> from skore import train_test_split, EstimatorReport
>>> X, y = load_diabetes(return_X_y=True)
>>> split_data = train_test_split(X=X, y=y, random_state=0, as_dict=True)
>>> reg = Ridge()
>>> report = EstimatorReport(reg, **split_data)
>>> display = report.metrics.prediction_error()
>>> df = display.frame()
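
Because the residuals column stores y_true - y_pred, summary statistics can be computed directly from the frame without recomputing predictions; a minimal sketch continuing the example above:

>>> mae = df["residuals"].abs().mean()  # mean absolute error on the test data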
help()[source]

Display the help for this object using rich or HTML.

plot(*, subplot_by='auto', kind='residual_vs_predicted', despine=True)[source]

Plot visualization.

Extra keyword arguments will be passed to matplotlib’s plot.

Parameters:
subplot_by : {“auto”, “data_source”, “split”, “estimator”, None}, default=”auto”

The variable to use for creating subplots (see the sketch after the examples below):

  • “auto” creates subplots by estimator for comparison reports, otherwise uses a single plot.

  • “data_source” creates subplots by data source (train/test).

  • “split” creates subplots by cross-validation split.

  • “estimator” creates subplots by estimator.

  • None creates a single plot.

kind : {“actual_vs_predicted”, “residual_vs_predicted”}, default=”residual_vs_predicted”

The type of plot to draw:

  • “actual_vs_predicted” draws the observed values (y-axis) vs. the predicted values (x-axis).

  • “residual_vs_predicted” draws the residuals, i.e. the difference between observed and predicted values (y-axis), vs. the predicted values (x-axis).

despine : bool, default=True

Whether to remove the top and right spines from the plot.

Examples

>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import Ridge
>>> from skore import train_test_split
>>> from skore import EstimatorReport
>>> X, y = load_diabetes(return_X_y=True)
>>> split_data = train_test_split(X=X, y=y, random_state=0, as_dict=True)
>>> regressor = Ridge()
>>> report = EstimatorReport(regressor, **split_data)
>>> display = report.metrics.prediction_error()
>>> display.plot(kind="actual_vs_predicted")
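
When the report contains several groups (estimators, splits, or both train and test data), subplot_by facets the figure accordingly; a sketch, assuming the display was produced from a cross-validation report rather than the EstimatorReport above:

>>> display.plot(subplot_by="split")  # doctest: +SKIP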
set_style(*, policy='update', relplot_kwargs=None, perfect_model_kwargs=None)[source]

Set the style parameters for the display.

Parameters:
policy : {“override”, “update”}, default=”update”

Policy to use when setting the style parameters. If “override”, existing settings are set to the provided values. If “update”, existing settings are not changed; only settings that were previously unset are changed.

relplot_kwargs : dict, default=None

Additional keyword arguments to be passed to seaborn.relplot() for rendering the scatter plot(s). Common options include palette, alpha, s, marker, etc.

perfect_model_kwargs : dict, default=None

Additional keyword arguments to be passed to matplotlib.pyplot.plot() for drawing the perfect prediction line. Common options include color, alpha, linestyle, etc.

Returns:
self : object

The instance with a modified style.

Raises:
ValueError

If a style parameter is unknown.
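
A minimal usage sketch, reusing the display from the examples above and only keyword options documented here; since set_style() returns self, it can be chained with plot():

>>> display.set_style(
...     relplot_kwargs={"alpha": 0.5, "s": 20},
...     perfect_model_kwargs={"color": "black", "linestyle": "--"},
... ).plot()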

static style_plot(plot_func)[source]

Apply consistent style to skore displays.

This decorator (sketched below):

  1. Applies default style settings

  2. Executes plot_func

  3. Calls plt.tight_layout() to make sure axes do not overlap

  4. Restores the original style settings

Parameters:
plot_func : callable

The plot function to be decorated.

Returns:
callable

The decorated plot function.
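
The contract above can be pictured with a small hypothetical sketch (not skore’s actual implementation; the default settings dict is made up for illustration). matplotlib’s rc_context handles steps 1 and 4 by restoring the previous rcParams on exit:

>>> import functools
>>> import matplotlib.pyplot as plt
>>> DEFAULT_STYLE = {"axes.grid": False}  # hypothetical default style settings
>>> def style_plot(plot_func):
...     @functools.wraps(plot_func)
...     def wrapper(*args, **kwargs):
...         with plt.rc_context(DEFAULT_STYLE):  # 1. apply defaults; 4. restored on exit
...             result = plot_func(*args, **kwargs)  # 2. execute plot_func
...             plt.tight_layout()  # 3. avoid overlapping axes
...         return result
...     return wrapper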