The skore API#

Skore has three types of reports: EstimatorReport (single train-test evaluation), CrossValidationReport (cross-validation), and ComparisonReport (comparing several estimators). All three are created via evaluate() by passing an estimator (or a list of estimators for comparison), the data X and y, and a splitter that controls the evaluation strategy.

This example showcases the unified API shared by these reports: they expose the same accessors (data, metrics, inspection). Methods that produce a visualization return a Display object with plot(), frame(), set_style(), and help().

Three report types, one API#

evaluate() returns one of three report types depending on its arguments: an EstimatorReport when splitter is a float or "prefit", a CrossValidationReport when splitter is an integer or a scikit-learn cross-validator (e.g. KFold, StratifiedKFold), and a ComparisonReport when a list of estimators is passed instead of a single one. All three share the same accessor layout where applicable:

  • data: dataset analysis

  • metrics: performance metrics and related displays (e.g. ROC, confusion matrix)

  • inspection: model inspection (e.g. coefficients, feature importance)

The data accessor is not available on ComparisonReport because compared models may use different input data; you can still inspect each underlying report. Methods on these accessors return Display objects with a common interface.
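The dispatch rules above can be made concrete with a short sketch. The function name pick_report and the returned strings are hypothetical stand-ins used only to illustrate the rules; this is not skore's actual implementation.

```python
from sklearn.model_selection import BaseCrossValidator, KFold


def pick_report(estimator, splitter):
    # Hypothetical sketch of evaluate()'s dispatch rules (not skore's code).
    if isinstance(estimator, list):
        # A list of estimators always yields a comparison.
        return "ComparisonReport"
    if splitter == "prefit" or isinstance(splitter, float):
        # Single train-test split (or an already-fitted estimator).
        return "EstimatorReport"
    if isinstance(splitter, (int, BaseCrossValidator)):
        # Number of folds, or an explicit scikit-learn cross-validator.
        return "CrossValidationReport"
    raise TypeError(f"unsupported splitter: {splitter!r}")
```

Note that the float check runs before the int check, so splitter=0.2 selects a single split while splitter=3 selects cross-validation.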

First report: single train-test split#

We call evaluate() with splitter=0.2, the default, to get an EstimatorReport. The accessors and display API shown below are the same for the other report types.

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from skore import evaluate
from skrub import tabular_pipeline

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
estimator = tabular_pipeline(LogisticRegression())

report = evaluate(estimator, X, y, splitter=0.2)

Data accessor: report.data.analyze() returns a display#

The data accessor provides dataset summaries. Its analyze() method returns a TableReportDisplay.

data_display = report.data.analyze()

Every display implements the same API. You can:

  • Plot it (with optional backend and style):

data_display.plot(kind="dist", x="mean radius", y="mean texture")

You can set the style of the plot via set_style() and then call plot():

data_display.set_style(scatterplot_kwargs={"color": "orange", "alpha": 1.0})
data_display.plot(kind="dist", x="mean radius", y="mean texture")
  • Export the underlying data as a DataFrame:

data_display.frame()

     mean radius  mean texture  mean perimeter  ...  worst fractal dimension  target
0          10.05         17.53           64.41  ...                  0.07664       1
1          10.80         21.98           68.79  ...                  0.07662       1
2          16.14         14.86          104.30  ...                  0.07012       1
...          ...           ...             ...  ...                      ...     ...
568        15.04         16.74           98.73  ...                  0.08549       1

569 rows × 31 columns
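For readers who want to see what the exported frame contains without skore, the same dataset can be assembled and summarized with plain pandas. This is an illustration of the table above, not skore code:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Rebuild the 569-row, 31-column frame (30 features + target) shown above.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
df = X.assign(target=y)

print(df.shape)          # (569, 31)
summary = df.describe()  # per-column statistics, one flavor of dataset analysis
```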



Metrics accessor: same idea, same display API#

The metrics accessor exposes methods such as confusion_matrix(), roc_curve(), precision_recall(), and prediction_error(). Each returns a display (e.g. ConfusionMatrixDisplay) with the same interface: plot(), frame(), set_style(), help().

metrics_display = report.metrics.confusion_matrix()
metrics_display.help()

The underlying counts can be exported with frame():

metrics_display.frame()

     true_label  predicted_label  value  threshold split           estimator data_source
268           0                0     44   0.417378  None  LogisticRegression        test
269           0                1      3   0.417378  None  LogisticRegression        test
270           1                0      2   0.417378  None  LogisticRegression        test
271           1                1     65   0.417378  None  LogisticRegression        test


Draw the confusion matrix by calling plot():

metrics_display.plot()

[Figure: confusion matrix (decision threshold: 0.50, data source: test set)]
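For reference, a long-format confusion-matrix table like the one above can be reproduced with scikit-learn and pandas alone. The split below is drawn independently of the report's internal one, so the counts will not match the frame exactly:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Melt the 2x2 matrix into the tidy (true_label, predicted_label, value) shape.
cm = confusion_matrix(y_test, model.predict(X_test))
tidy = pd.DataFrame(
    [
        {"true_label": t, "predicted_label": p, "value": cm[t, p]}
        for t in (0, 1)
        for p in (0, 1)
    ]
)
```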

Inspection accessor#

The inspection accessor exposes model-specific displays (e.g. coefficients() for linear models, impurity_decrease() for trees). These also return Display objects with the same plot(), frame(), set_style(), and help() methods.

inspection_display = report.inspection.coefficients()
inspection_display.plot(select_k=15, sorting_order="descending")
[Figure: coefficients of LogisticRegression]
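The select_k and sorting_order behaviour can be mimicked outside skore with pandas. The model here is refit independently, so the exact values will differ from the display:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, y)

# One coefficient per feature; keep the 15 largest, in descending order,
# roughly what plot(select_k=15, sorting_order="descending") renders.
coefs = pd.Series(model[-1].coef_.ravel(), index=X.columns)
top_15 = coefs.sort_values(ascending=False).head(15)
```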

Second report type: cross-validation#

Using the same evaluate() with an integer splitter returns a CrossValidationReport. The same accessors and display API apply; only the way the report was built changes.

cv_report = evaluate(estimator, X, y, splitter=3)

Again: data, metrics, and inspection return displays with plot(), frame(), and set_style().

cv_report.data.analyze().plot(kind="dist", x="mean radius", y="mean texture")
cv_report.metrics.confusion_matrix().plot()
[Figure: confusion matrix (decision threshold: 0.50, data source: test set)]
cv_report.inspection.coefficients().plot(select_k=10, sorting_order="descending")
[Figure: coefficients of LogisticRegression]
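The per-split structure behind the cross-validated confusion matrix (the "split" column in the exported frame) can be sketched with scikit-learn directly. Again an illustration under a hand-chosen cross-validator, not skore's implementation:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

# One confusion matrix per fold, each computed on that fold's test portion.
per_split = []
for split_idx, (train, test) in enumerate(cv.split(X, y)):
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X.iloc[train], y.iloc[train])
    cm = confusion_matrix(y.iloc[test], model.predict(X.iloc[test]))
    per_split.append((split_idx, cm))
```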

Third report type: comparison#

Passing a list of estimators to evaluate() returns a ComparisonReport. It exposes the same metrics and inspection accessors (no data accessor, since compared models can use different datasets). The display API is unchanged.
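What a comparison aggregates can be sketched as a side-by-side metrics frame. The two models and the accuracy-only comparison below are illustrative choices, not what ComparisonReport computes internally:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "dummy": DummyClassifier(strategy="most_frequent"),
}

# One column per estimator, one row per metric.
comparison = pd.DataFrame(
    {
        name: {"accuracy": model.fit(X_train, y_train).score(X_test, y_test)}
        for name, model in models.items()
    }
)
```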

Summary#

  • Three report types (EstimatorReport, CrossValidationReport, ComparisonReport) are all created with evaluate() and share the same accessor layout: report.data, report.metrics, report.inspection (where applicable).

  • Accessor methods that produce figures or tables return Display objects.

  • Displays share a single, predictable API:

    • plot(**kwargs) — render the visualization

    • frame(**kwargs) — return the data as a pandas.DataFrame

    • set_style(policy=..., **kwargs) — customize appearance

    • help() — show available options

This consistency makes it easy to switch between report types and to reuse the same workflow across data, metrics, and inspection.
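The shared contract summarized above can be written down as a Python Protocol. The names mirror the four methods listed in the summary, but DisplayLike and FakeDisplay are a reader's sketch, not skore's actual classes:

```python
from typing import Any, Protocol, runtime_checkable

import pandas as pd


@runtime_checkable
class DisplayLike(Protocol):
    """Sketch of the four-method contract shared by skore displays."""

    def plot(self, **kwargs: Any) -> None: ...
    def frame(self, **kwargs: Any) -> pd.DataFrame: ...
    def set_style(self, **kwargs: Any) -> None: ...
    def help(self) -> None: ...


class FakeDisplay:
    # Minimal conforming object, for illustration only.
    def plot(self, **kwargs):
        pass

    def frame(self, **kwargs):
        return pd.DataFrame()

    def set_style(self, **kwargs):
        pass

    def help(self):
        print("plot(), frame(), set_style(), help()")
```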

Total running time of the script: (0 minutes 8.301 seconds)

Gallery generated by Sphinx-Gallery