Automatic detection of modelling issues

skore can automatically detect common modelling pitfalls such as overfitting and underfitting. This example walks through the checks accessor: how to run checks, how to read the detected issues, and how to mute specific checks.

We use a purely non-linear regression target and deliberately pick models that fail in known ways:

  • a linear model that cannot capture the non-linearity → underfitting,

  • a single deep decision tree that memorizes the training set perfectly and fails to generalize → overfitting.

Setup

The target is a product of trigonometric functions of the first two features: completely invisible to a linear model, yet easy for an unconstrained tree to memorize.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

from skore import evaluate  # used below to build the reports

rng = np.random.default_rng(42)
n_samples = 500
X = rng.uniform(0, 1, (n_samples, 5))
y = np.sin(2 * np.pi * X[:, 0]) * np.cos(2 * np.pi * X[:, 1]) + rng.normal(
    0, 0.1, n_samples
)

linear = LinearRegression()
deep_tree = DecisionTreeRegressor(random_state=42)

Calling summarize() explicitly

Every report exposes a checks accessor which provides several methods. Let's use summarize() to see what issues are detected for the linear model.

linear_report = evaluate(linear, X, y)
linear_report.checks.summarize()

1 issue(s), 1 tip(s), 1 passed, 0 ignored.

Let's look at the metrics to understand why:


linear_report.metrics.summarize(data_source="both").frame()
                 LinearRegression (train)  LinearRegression (test)
Metric
R²                               0.001906                -0.015818
RMSE                             0.522214                 0.504156
MAE                              0.423723                 0.406739
MAPE                             1.426344                 1.032250
Fit time (s)                     0.000969                 0.000969
Predict time (s)                 0.000134                 0.000258


The linear model is flagged for underfitting: its train and test scores are on par, and neither is significantly better than a dummy baseline.
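To make the dummy-baseline comparison concrete, we can score a constant-mean DummyRegressor next to the linear model with plain scikit-learn; this is an illustrative sketch, independent of skore:

from sklearn.dummy import DummyRegressor
from sklearn.model_selection import cross_val_score

# A constant-mean predictor has an R² close to zero on held-out data;
# the linear model barely improves on it, the signature of underfitting.
baseline = DummyRegressor(strategy="mean")
print(cross_val_score(baseline, X, y, scoring="r2").mean())
print(cross_val_score(linear, X, y, scoring="r2").mean())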

Let’s now inspect the deep tree model.

tree_report = evaluate(deep_tree, X, y)
tree_report.checks.summarize()


tree_report.metrics.summarize(data_source="both").frame()
                 DecisionTreeRegressor (train)  DecisionTreeRegressor (test)
Metric
R²                                    1.000000                      0.783887
RMSE                                  0.000000                      0.232540
MAE                                   0.000000                      0.180261
MAPE                                  0.000000                      1.052768
Fit time (s)                          0.002173                      0.002173
Predict time (s)                      0.000217                      0.000138


The deep tree is flagged for overfitting: it achieves a perfect fit on train (zero error) yet clearly degrades on test. The coefficient tip raised for the linear model does not apply to a tree, so it is not reported here.
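To confirm the diagnosis, we can constrain the tree and re-run the checks; a quick sketch reusing evaluate() from above, where max_depth=5 is an arbitrary illustrative cap:

# Capping the depth keeps the tree from memorizing the training set,
# so the overfitting issue should no longer be flagged.
shallow_tree = DecisionTreeRegressor(max_depth=5, random_state=42)
shallow_report = evaluate(shallow_tree, X, y)
shallow_report.checks.summarize()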

Ignoring specific checks

Each check has a stable code (e.g. SKD001, SKD002). You can mute individual checks per call:

tree_report.checks.summarize(ignore=["SKD001"])


Or via the configuration, so that every summarize() call inside the context skips them:

import skore

with skore.configuration(ignore_checks=["SKD001"]):
    checks_summary = tree_report.checks.summarize()
checks_summary


Checks on a CrossValidationReport

When splitter is an integer, evaluate() returns a CrossValidationReport. Checks aggregate issues across folds.

cv_report = evaluate(deep_tree, X, y, splitter=5)
cv_report.checks.summarize()
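Since every report exposes a checks accessor, the folds can also be inspected one by one; a sketch assuming CrossValidationReport exposes its per-fold reports as estimator_reports_ (an assumption about the skore API):

# Run the checks on each fold's individual report; estimator_reports_
# is assumed to hold one EstimatorReport per fold.
for fold_report in cv_report.estimator_reports_:
    fold_report.checks.summarize()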


Checks on a ComparisonReport

Passing a list of estimators returns a ComparisonReport. Issues are grouped by sub-report.
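A minimal sketch consistent with that description, assuming evaluate() accepts the estimator list directly:

comparison_report = evaluate([linear, deep_tree], X, y)
comparison_report.checks.summarize()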


