The skore API#

Skore has three types of reports: EstimatorReport (single train-test evaluation), CrossValidationReport (cross-validation), and ComparisonReport (comparing several estimators). All three are created via evaluate() by passing an estimator (or a list or dict of named estimators for comparison), the data X and y, and a splitter that controls the evaluation strategy.

This example showcases the unified API shared by these reports: they expose the same accessors (data, metrics, inspection). Methods that produce a visualization return a Display object with plot(), frame(), set_style(), and help().

Three report types, one API#

evaluate() returns one of three report types depending on its arguments: an EstimatorReport when splitter is a float or "prefit", a CrossValidationReport when splitter is an integer or a scikit-learn cross-validator (e.g. KFold, StratifiedKFold), or a ComparisonReport when the estimator argument is a list or dict of estimators. All three respect the same accessor layout where applicable:

  • data: dataset analysis

  • metrics: performance metrics and related displays (e.g. ROC, confusion matrix)

  • inspection: model inspection (e.g. coefficients, feature importance)

The data accessor is not available on ComparisonReport because compared models may use different input data; you can still inspect each underlying report. Methods on these accessors return Display objects with a common interface.

First report: single train-test split#

We call evaluate() with the default splitter=0.2 to get an EstimatorReport. The accessors and display API shown below are the same for the other report types.

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from skore import evaluate
from skrub import tabular_pipeline

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
estimator = tabular_pipeline(LogisticRegression())

report = evaluate(estimator, X, y, splitter=0.2)

Data accessor: report.data.analyze() returns a display#

The data accessor provides dataset summaries. Its analyze() method returns a TableReportDisplay.

data_display = report.data.analyze()

Every display implements the same API. You can:

  • Plot it (with optional backend and style):

data_display.plot(kind="dist", x="mean radius", y="mean texture")

You can set the style of the plot via set_style() and then call plot():

data_display.set_style(scatterplot_kwargs={"color": "orange", "alpha": 1.0})
data_display.plot(kind="dist", x="mean radius", y="mean texture")
  • Export the underlying data as a DataFrame:

data_display.frame()
mean radius mean texture mean perimeter mean area mean smoothness mean compactness mean concavity mean concave points mean symmetry mean fractal dimension radius error texture error perimeter error area error smoothness error compactness error concavity error concave points error symmetry error fractal dimension error worst radius worst texture worst perimeter worst area worst smoothness worst compactness worst concavity worst concave points worst symmetry worst fractal dimension target
0 10.05 17.53 64.41 310.8 0.10070 0.07326 0.02511 0.01775 0.1890 0.06331 0.2619 2.0150 1.778 16.85 0.007803 0.01449 0.01690 0.008043 0.02100 0.002778 11.16 26.84 71.98 384.0 0.1402 0.14020 0.1055 0.06499 0.2894 0.07664 1
1 10.80 21.98 68.79 359.9 0.08801 0.05743 0.03614 0.01404 0.2016 0.05977 0.3077 1.6210 2.240 20.20 0.006543 0.02148 0.02991 0.010450 0.01844 0.002690 12.76 32.04 83.69 489.5 0.1303 0.16960 0.1927 0.07485 0.2965 0.07662 1
2 16.14 14.86 104.30 800.0 0.09495 0.08501 0.05500 0.04528 0.1735 0.05875 0.2387 0.6372 1.729 21.83 0.003958 0.01246 0.01831 0.008747 0.01500 0.001621 17.71 19.58 115.90 947.9 0.1206 0.17220 0.2310 0.11290 0.2778 0.07012 1
3 12.18 17.84 77.79 451.1 0.10450 0.07057 0.02490 0.02941 0.1900 0.06635 0.3661 1.5110 2.410 24.44 0.005433 0.01179 0.01131 0.015190 0.02220 0.003408 12.83 20.92 82.14 495.2 0.1140 0.09358 0.0498 0.05882 0.2227 0.07376 1
4 12.25 22.44 78.18 466.5 0.08192 0.05200 0.01714 0.01261 0.1544 0.05976 0.2239 1.1390 1.577 18.04 0.005096 0.01205 0.00941 0.004551 0.01608 0.002399 14.17 31.99 92.74 622.9 0.1256 0.18040 0.1230 0.06335 0.3100 0.08203 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
564 17.42 25.56 114.50 948.0 0.10060 0.11460 0.16820 0.06597 0.1308 0.05866 0.5296 1.6670 3.767 58.53 0.031130 0.08555 0.14380 0.039270 0.02175 0.012560 18.07 28.07 120.40 1021.0 0.1243 0.17930 0.2803 0.10990 0.1603 0.06818 0
565 12.75 16.70 82.51 493.8 0.11250 0.11170 0.03880 0.02995 0.2120 0.06623 0.3834 1.0030 2.495 28.62 0.007509 0.01561 0.01977 0.009199 0.01805 0.003629 14.45 21.74 93.63 624.1 0.1475 0.19790 0.1423 0.08045 0.3071 0.08557 1
566 20.18 19.54 133.80 1250.0 0.11330 0.14890 0.21330 0.12590 0.1724 0.06053 0.4331 1.0010 3.008 52.49 0.009087 0.02715 0.05546 0.019100 0.02451 0.004005 22.03 25.07 146.00 1479.0 0.1665 0.29420 0.5308 0.21730 0.3032 0.08075 0
567 18.31 20.58 120.80 1052.0 0.10680 0.12480 0.15690 0.09451 0.1860 0.05941 0.5449 0.9225 3.218 67.36 0.006176 0.01877 0.02913 0.010460 0.01559 0.002725 21.86 26.20 142.20 1493.0 0.1492 0.25360 0.3759 0.15100 0.3074 0.07863 0
568 15.04 16.74 98.73 689.4 0.09883 0.13640 0.07721 0.06142 0.1668 0.06869 0.3720 0.8423 2.304 34.84 0.004123 0.01819 0.01996 0.010040 0.01055 0.003237 16.76 20.43 109.70 856.9 0.1135 0.21760 0.1856 0.10180 0.2177 0.08549 1

569 rows × 31 columns



Metrics accessor: same idea, same display API#

The metrics accessor exposes methods such as confusion_matrix(), roc_curve(), precision_recall(), and prediction_error(). Each returns a display (e.g. ConfusionMatrixDisplay) with the same interface: plot(), frame(), set_style(), help().

metrics_display = report.metrics.confusion_matrix()
metrics_display.help()

The frame() method returns the underlying values as a DataFrame:

metrics_display.frame()


true_label predicted_label value split estimator data_source
0 0 0 45 NaN LogisticRegression test
1 0 1 2 NaN LogisticRegression test
2 1 0 2 NaN LogisticRegression test
3 1 1 65 NaN LogisticRegression test


Draw the confusion matrix by calling plot():

_ = metrics_display.plot()
Confusion Matrix Data source: Test set

Inspection accessor#

The inspection accessor exposes model-specific displays (e.g. coefficients() for linear models, impurity_decrease() for trees). These also return Display objects with the same plot(), frame(), set_style(), and help() methods.

inspection_display = report.inspection.coefficients()
_ = inspection_display.plot(select_k=15, sorting_order="descending")
Coefficients of LogisticRegression

Second report type: cross-validation#

Using the same evaluate() with an integer splitter returns a CrossValidationReport. The same accessors and display API apply; only the way the report was built changes.

cv_report = evaluate(estimator, X, y, splitter=3)

Again: data, metrics, and inspection return displays with plot(), frame(), and set_style().

cv_report.data.analyze().plot(kind="dist", x="mean radius", y="mean texture")
_ = cv_report.metrics.confusion_matrix().plot()
Confusion Matrix Data source: Test set
_ = cv_report.inspection.coefficients().plot(select_k=10, sorting_order="descending")
Coefficients of LogisticRegression

Third report type: comparison#

Passing a list or dict of estimators to evaluate() returns a ComparisonReport. It exposes the same metrics and inspection accessors (no data accessor, since compared models can use different datasets). The display API is unchanged.

Summary#

  • Three report types (EstimatorReport, CrossValidationReport, ComparisonReport) are all created with evaluate() and share the same accessor layout: report.data, report.metrics, report.inspection (where applicable).

  • Accessor methods that produce figures or tables return Display objects.

  • Displays share a single, predictable API:

    • plot(**kwargs) — render the visualization

    • frame(**kwargs) — return the data as a pandas.DataFrame

    • set_style(policy=..., **kwargs) — customize appearance

    • help() — show available options

This consistency makes it easy to switch between report types and to reuse the same workflow across data, metrics, and inspection.

Total running time of the script: (0 minutes 9.397 seconds)

Gallery generated by Sphinx-Gallery