Automatic detection of modeling issues#
skore can automatically detect common modeling pitfalls such as overfitting
and underfitting. This example walks through the diagnostics API: how to
trigger diagnostics, how to read the results, and how to mute specific checks.
We use a purely non-linear regression target and deliberately pick models that fail in known ways:
- a linear model that cannot capture the non-linearity → underfitting,
- a single deep decision tree that memorizes the training set perfectly and fails to generalize → overfitting.
Setup#
The target is a product of trigonometric functions of the first two features: completely invisible to a linear model, yet easy to memorize for an unconstrained tree.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
rng = np.random.default_rng(42)
n_samples = 500
X = rng.uniform(0, 1, (n_samples, 5))
y = np.sin(2 * np.pi * X[:, 0]) * np.cos(2 * np.pi * X[:, 1]) + rng.normal(
    0, 0.1, n_samples
)
linear = LinearRegression()
deep_tree = DecisionTreeRegressor(random_state=42)
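As a quick sanity check on why a linear model has nothing to latch onto, the sketch below rebuilds the same data and measures the Pearson correlation between each feature and the target. It relies only on NumPy, and the name `correlations` is introduced here for illustration. Because the target is a product of a sine and a cosine taken over full periods, no single feature is expected to show a meaningful linear association with it.

```python
import numpy as np

# Same data-generating process as in the setup above.
rng = np.random.default_rng(42)
n_samples = 500
X = rng.uniform(0, 1, (n_samples, 5))
y = np.sin(2 * np.pi * X[:, 0]) * np.cos(2 * np.pi * X[:, 1]) + rng.normal(
    0, 0.1, n_samples
)

# Pearson correlation of each feature with the target.
correlations = np.array(
    [np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])]
)
for j, r in enumerate(correlations):
    print(f"Feature {j}: r = {r:+.3f}")
```

Correlations close to zero for all five features, including the two that actually drive the target, are consistent with the near-zero R² that the linear model obtains later in this example.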
Calling diagnose() explicitly#
Every report exposes a diagnose() method.
Diagnostics are computed lazily and cached, so calling
diagnose() is always cheap after the first call.
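The "lazy and cached" behaviour can be pictured with a small standalone sketch. This is not skore's implementation: `ToyReport`, its attributes, and the 0.1 gap threshold are all hypothetical, and the caching here comes from the standard library's `functools.cached_property`.

```python
from functools import cached_property


class ToyReport:
    """Hypothetical stand-in for a report; not skore's implementation."""

    def __init__(self, train_score, test_score):
        self.train_score = train_score
        self.test_score = test_score
        self.n_computations = 0  # counts how often the checks actually run

    @cached_property
    def _diagnostics(self):
        # Runs only on first access; the result is then stored on the instance.
        self.n_computations += 1
        gap = self.train_score - self.test_score
        # The 0.1 threshold is an arbitrary value chosen for this sketch.
        return ["large train/test gap"] if gap > 0.1 else []

    def diagnose(self):
        return self._diagnostics


report = ToyReport(train_score=1.0, test_score=0.78)
first = report.diagnose()   # triggers the computation
second = report.diagnose()  # served from the cache
print(first, report.n_computations)
```

Nothing is computed when the object is created, and the second call returns the stored result without re-running the checks.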
from skore import evaluate
linear_report = evaluate(linear, X, y)
linear_report
Results
Investigate more with EstimatorReport.metrics.
| Metric | LinearRegression |
|---|---|
| R² | -0.015818 |
| RMSE | 0.504156 |
| Fit time (s) | 0.000950 |
| Predict time (s) | 0.000234 |
Use .help() for information on available functionality.
| | Feature 0 | Feature 1 | Feature 2 | Feature 3 | Feature 4 | Target |
|---|---|---|---|---|---|---|
| 0 | 0.209 | 0.525 | 0.164 | 0.166 | 0.836 | -1.16 |
| 1 | 0.745 | 0.821 | 0.749 | 0.288 | 0.118 | -0.378 |
| 2 | 0.523 | 0.764 | 0.799 | 0.492 | 0.600 | 0.131 |
| 3 | 0.462 | 0.327 | 0.305 | 0.251 | 0.365 | -0.181 |
| 4 | 0.745 | 0.968 | 0.326 | 0.370 | 0.470 | -0.925 |
| … | … | … | … | … | … | … |
| 495 | 0.582 | 0.994 | 0.990 | 0.527 | 0.639 | -0.419 |
| 496 | 0.0435 | 0.181 | 0.237 | 0.249 | 0.571 | 0.213 |
| 497 | 0.119 | 0.937 | 0.895 | 0.186 | 0.323 | 0.677 |
| 498 | 0.779 | 0.135 | 0.536 | 0.514 | 0.858 | -0.500 |
| 499 | 0.0917 | 0.667 | 0.656 | 0.663 | 0.0198 | -0.272 |
| Column | Column name | dtype | Is sorted | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Feature 0 | Float64DType | False | 0 (0.0%) | 500 (100.0%) | 0.503 | 0.295 | 0.00107 | 0.505 | 1.00 |
| 1 | Feature 1 | Float64DType | False | 0 (0.0%) | 500 (100.0%) | 0.504 | 0.287 | 0.000568 | 0.489 | 0.994 |
| 2 | Feature 2 | Float64DType | False | 0 (0.0%) | 500 (100.0%) | 0.498 | 0.285 | 0.000519 | 0.495 | 0.999 |
| 3 | Feature 3 | Float64DType | False | 0 (0.0%) | 500 (100.0%) | 0.489 | 0.292 | 0.00123 | 0.491 | 0.999 |
| 4 | Feature 4 | Float64DType | False | 0 (0.0%) | 500 (100.0%) | 0.502 | 0.293 | 0.00474 | 0.498 | 0.999 |
| 5 | Target | Float64DType | False | 0 (0.0%) | 500 (100.0%) | 0.00464 | 0.519 | -1.16 | 0.0352 | 1.21 |
- [SKD002] Potential underfitting. Train/test scores are on par and not significantly better than the dummy baseline for 2/2 comparable metrics. Read our documentation for more details. Mute with ignore=['SKD002'].
linear_report.metrics.summarize(data_source="both").frame()
| Metric | LinearRegression (train) | LinearRegression (test) |
|---|---|---|
| R² | 0.001906 | -0.015818 |
| RMSE | 0.522214 | 0.504156 |
| Fit time (s) | 0.000950 | 0.000950 |
| Predict time (s) | 0.000136 | 0.000234 |
The linear model is flagged for underfitting: its scores are on par between train and test, and not significantly better than a dummy baseline.
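To make "on par with a dummy baseline" concrete, here is a minimal hand-rolled comparison using scikit-learn only. It is a sketch, not the rule skore applies for SKD002: the 80/20 split, its `random_state`, and the `scores` dictionary are assumptions made for illustration, so the numbers will not match the report above exactly.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Same data-generating process as in the setup above.
rng = np.random.default_rng(42)
n_samples = 500
X = rng.uniform(0, 1, (n_samples, 5))
y = np.sin(2 * np.pi * X[:, 0]) * np.cos(2 * np.pi * X[:, 1]) + rng.normal(
    0, 0.1, n_samples
)

# Assumed split for this sketch; the split used by the report may differ.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = [
    ("dummy", DummyRegressor(strategy="mean")),  # always predicts the train mean
    ("linear", LinearRegression()),
]
scores = {}
for name, model in models:
    model.fit(X_train, y_train)
    scores[name] = {
        "train": r2_score(y_train, model.predict(X_train)),
        "test": r2_score(y_test, model.predict(X_test)),
    }
    print(f"{name:>6}: train R² = {scores[name]['train']:+.3f}, "
          f"test R² = {scores[name]['test']:+.3f}")
```

When the linear model's train and test R² both sit next to the dummy's scores, the model has learned essentially nothing, which is the situation the underfitting diagnostic describes.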
tree_report = evaluate(deep_tree, X, y)
tree_report.diagnose()
- [SKD001] Potential overfitting. Significant train/test gaps were found for 2/2 default predictive metrics. Read our documentation for more details. Mute with ignore=['SKD001'].
tree_report.metrics.summarize(data_source="both").frame()
| Metric | DecisionTreeRegressor (train) | DecisionTreeRegressor (test) |
|---|---|---|
| R² | 1.000000 | 0.783887 |
| RMSE | 0.000000 | 0.232540 |
| Fit time (s) | 0.003311 | 0.003311 |
| Predict time (s) | 0.000340 | 0.000200 |
The deep tree is flagged for overfitting: it achieves a perfect score on train but degrades on test.
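The train/test gap behind this flag can be reproduced by hand with scikit-learn. The sketch below is not skore's SKD001 criterion: the 80/20 split and its `random_state` are assumptions, so the test score will differ somewhat from the table above, but the pattern of a perfect train score and a clearly lower test score should be the same.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Same data-generating process as in the setup above.
rng = np.random.default_rng(42)
n_samples = 500
X = rng.uniform(0, 1, (n_samples, 5))
y = np.sin(2 * np.pi * X[:, 0]) * np.cos(2 * np.pi * X[:, 1]) + rng.normal(
    0, 0.1, n_samples
)

# Assumed split for this sketch; the split used by the report may differ.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# An unconstrained tree keeps splitting until every training sample is isolated.
tree = DecisionTreeRegressor(random_state=42).fit(X_train, y_train)
train_r2 = r2_score(y_train, tree.predict(X_train))
test_r2 = r2_score(y_test, tree.predict(X_test))
gap = train_r2 - test_r2
print(f"train R² = {train_r2:.3f}, test R² = {test_r2:.3f}, gap = {gap:.3f}")
```

Limiting the tree, for example with `max_depth` or `min_samples_leaf`, is the usual way to shrink this gap.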
Ignoring specific checks#
Each diagnostic has a stable code (e.g. SKD001, SKD002). You can
mute individual checks per call:
tree_report.diagnose(ignore=["SKD001"])
- No issues were detected in your report!
Or through the skore configuration, so that every diagnose() call made inside
the with block skips them:
import skore
with skore.configuration(ignore_diagnostics=["SKD001"]):
    diagnosis = tree_report.diagnose()
diagnosis
- No issues were detected in your report!
Diagnostics on a CrossValidationReport#
When splitter is an integer, evaluate() returns a
CrossValidationReport. Diagnostics aggregate across folds.
evaluate(deep_tree, X, y, splitter=5).diagnose()
- [SKD001] Potential overfitting. Detected in 5/5 evaluated splits. Read our documentation for more details. Mute with ignore=['SKD001'].
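The "detected in k/n splits" style of aggregation can be sketched with scikit-learn's `cross_validate`. This is an illustration under stated assumptions rather than skore's logic: the shuffled `KFold`, the `gap_threshold` of 0.05, and the names `gaps` and `n_flagged` are all hypothetical.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_validate
from sklearn.tree import DecisionTreeRegressor

# Same data-generating process as in the setup above.
rng = np.random.default_rng(42)
n_samples = 500
X = rng.uniform(0, 1, (n_samples, 5))
y = np.sin(2 * np.pi * X[:, 0]) * np.cos(2 * np.pi * X[:, 1]) + rng.normal(
    0, 0.1, n_samples
)

# Five folds, mirroring an integer splitter of 5 (assumed equivalent here).
cv = KFold(n_splits=5, shuffle=True, random_state=0)
results = cross_validate(
    DecisionTreeRegressor(random_state=42),
    X,
    y,
    cv=cv,
    scoring="r2",
    return_train_score=True,
)

# Per-fold train/test gap, then a count of folds exceeding the threshold.
gaps = results["train_score"] - results["test_score"]
gap_threshold = 0.05  # arbitrary value chosen for this sketch
n_flagged = int(np.sum(gaps > gap_threshold))
print(f"Train/test gap above {gap_threshold} in {n_flagged}/{len(gaps)} splits")
```

Counting flagged folds, rather than averaging scores first, keeps a single unlucky split from hiding or creating an issue on its own.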
Diagnostics on a ComparisonReport#
Passing a list of estimators returns a ComparisonReport.
Diagnostics are grouped by sub-report.
evaluate([linear, deep_tree], X, y).diagnose()
- [SKD002] Potential underfitting. [LinearRegression] Train/test scores are on par and not significantly better than the dummy baseline for 2/2 comparable metrics. Read our documentation for more details. Mute with ignore=['SKD002'].
- [SKD001] Potential overfitting. [DecisionTreeRegressor] Significant train/test gaps were found for 2/2 default predictive metrics. Read our documentation for more details. Mute with ignore=['SKD001'].