The skore API#

Skore has three types of reports: EstimatorReport (single train-test evaluation), CrossValidationReport (cross-validation), and ComparisonReport (comparing several estimators). All three are created via evaluate() by passing an estimator (or a list of estimators for comparison), the data X and y, and a splitter that controls the evaluation strategy.

This example showcases the unified API shared by these reports: they expose the same accessors (data, metrics, inspection). Methods that produce a visualization return a Display object with plot(), frame(), set_style(), and help().

Three report types, one API#

evaluate() returns one of three report types depending on its arguments: an EstimatorReport when splitter is a float or "prefit", a CrossValidationReport when splitter is an integer or a scikit-learn cross-validator (e.g. KFold, StratifiedKFold), and a ComparisonReport when a list of estimators is passed instead of a single one. All three share the same accessor layout where applicable:

  • data: dataset analysis

  • metrics: performance metrics and related displays (e.g. ROC, confusion matrix)

  • inspection: model inspection (e.g. coefficients, feature importance)

The data accessor is not available on ComparisonReport because compared models may use different input data; you can still inspect each underlying report. Methods on these accessors return Display objects with a common interface.
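The dispatch rules above can be made concrete with a short sketch. The function name pick_report and the returned strings are hypothetical stand-ins used only to illustrate the rules; this is not skore's actual implementation.

```python
from sklearn.model_selection import BaseCrossValidator, KFold


def pick_report(estimator, splitter):
    # Hypothetical sketch of evaluate()'s dispatch rules (not skore's code).
    if isinstance(estimator, list):
        # A list of estimators always yields a comparison.
        return "ComparisonReport"
    if splitter == "prefit" or isinstance(splitter, float):
        # Single train-test split (or an already-fitted estimator).
        return "EstimatorReport"
    if isinstance(splitter, (int, BaseCrossValidator)):
        # Number of folds, or an explicit scikit-learn cross-validator.
        return "CrossValidationReport"
    raise TypeError(f"unsupported splitter: {splitter!r}")
```

Note that the float check runs before the int check, so splitter=0.2 selects a single split while splitter=3 selects cross-validation.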

First report: single train-test split#

We call evaluate() with splitter=0.2, the default, to get an EstimatorReport. The accessors and display API shown below are the same for the other report types.

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from skore import evaluate
from skrub import tabular_pipeline

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
estimator = tabular_pipeline(LogisticRegression())

report = evaluate(estimator, X, y, splitter=0.2)

Data accessor: report.data.analyze() returns a display#

The data accessor provides dataset summaries. Its analyze() method returns a TableReportDisplay.

data_display = report.data.analyze()

Every display implements the same API. You can:

  • Plot it (with optional backend and style):

data_display.plot(kind="dist", x="mean radius", y="mean texture")

You can set the style of the plot via set_style() and then call plot():

data_display.set_style(scatterplot_kwargs={"color": "orange", "alpha": 1.0})
data_display.plot(kind="dist", x="mean radius", y="mean texture")
  • Export the underlying data as a DataFrame:

data_display.frame()

     mean radius  mean texture  mean perimeter  ...  worst fractal dimension  target
0          10.05         17.53           64.41  ...                  0.07664       1
1          10.80         21.98           68.79  ...                  0.07662       1
2          16.14         14.86          104.30  ...                  0.07012       1
...          ...           ...             ...  ...                      ...     ...
568        15.04         16.74           98.73  ...                  0.08549       1

569 rows × 31 columns
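For readers who want to see what the exported frame contains without skore, the same dataset can be assembled and summarized with plain pandas. This is an illustration of the table above, not skore code:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Rebuild the 569-row, 31-column frame (30 features + target) shown above.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
df = X.assign(target=y)

print(df.shape)          # (569, 31)
summary = df.describe()  # per-column statistics, one flavor of dataset analysis
```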



Metrics accessor: same idea, same display API#

The metrics accessor exposes methods such as confusion_matrix(), roc_curve(), precision_recall(), and prediction_error(). Each returns a display (e.g. ConfusionMatrixDisplay) with the same interface: plot(), frame(), set_style(), help().

metrics_display = report.metrics.confusion_matrix()
metrics_display.help()

The underlying counts can be exported with frame():

metrics_display.frame()

     true_label  predicted_label  value  threshold split           estimator data_source
268           0                0     44   0.417378  None  LogisticRegression        test
269           0                1      3   0.417378  None  LogisticRegression        test
270           1                0      2   0.417378  None  LogisticRegression        test
271           1                1     65   0.417378  None  LogisticRegression        test


Draw the confusion matrix by calling plot():

metrics_display.plot()

[Figure: confusion matrix (decision threshold: 0.50, data source: test set)]
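For reference, a long-format confusion-matrix table like the one above can be reproduced with scikit-learn and pandas alone. The split below is drawn independently of the report's internal one, so the counts will not match the frame exactly:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Melt the 2x2 matrix into the tidy (true_label, predicted_label, value) shape.
cm = confusion_matrix(y_test, model.predict(X_test))
tidy = pd.DataFrame(
    [
        {"true_label": t, "predicted_label": p, "value": cm[t, p]}
        for t in (0, 1)
        for p in (0, 1)
    ]
)
```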

Inspection accessor#

The inspection accessor exposes model-specific displays (e.g. coefficients() for linear models, impurity_decrease() for trees). These also return Display objects with the same plot(), frame(), set_style(), and help() methods.

inspection_display = report.inspection.coefficients()
inspection_display.plot(select_k=15, sorting_order="descending")
[Figure: coefficients of LogisticRegression]
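The select_k and sorting_order behaviour can be mimicked outside skore with pandas. The model here is refit independently, so the exact values will differ from the display:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, y)

# One coefficient per feature; keep the 15 largest, in descending order,
# roughly what plot(select_k=15, sorting_order="descending") renders.
coefs = pd.Series(model[-1].coef_.ravel(), index=X.columns)
top_15 = coefs.sort_values(ascending=False).head(15)
```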

Second report type: cross-validation#

Using the same evaluate() with an integer splitter returns a CrossValidationReport. The same accessors and display API apply; only the way the report was built changes.

cv_report = evaluate(estimator, X, y, splitter=3)

Again: data, metrics, and inspection return displays with plot(), frame(), and set_style().

cv_report.data.analyze().plot(kind="dist", x="mean radius", y="mean texture")
cv_report.metrics.confusion_matrix().plot()
[Figure: confusion matrix (decision threshold: 0.50, data source: test set)]
cv_report.inspection.coefficients().plot(select_k=10, sorting_order="descending")
[Figure: coefficients of LogisticRegression]
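The per-split structure behind the cross-validated confusion matrix (the "split" column in the exported frame) can be sketched with scikit-learn directly. Again an illustration under a hand-chosen cross-validator, not skore's implementation:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

# One confusion matrix per fold, each computed on that fold's test portion.
per_split = []
for split_idx, (train, test) in enumerate(cv.split(X, y)):
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X.iloc[train], y.iloc[train])
    cm = confusion_matrix(y.iloc[test], model.predict(X.iloc[test]))
    per_split.append((split_idx, cm))
```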

Third report type: comparison#

Passing a list of estimators to evaluate() returns a ComparisonReport. It exposes the same metrics and inspection accessors (no data accessor, since compared models can use different datasets). The display API is unchanged.
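What a comparison aggregates can be sketched as a side-by-side metrics frame. The two models and the accuracy-only comparison below are illustrative choices, not what ComparisonReport computes internally:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "dummy": DummyClassifier(strategy="most_frequent"),
}

# One column per estimator, one row per metric.
comparison = pd.DataFrame(
    {
        name: {"accuracy": model.fit(X_train, y_train).score(X_test, y_test)}
        for name, model in models.items()
    }
)
```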

Summary#

  • Three report types (EstimatorReport, CrossValidationReport, ComparisonReport) are all created with evaluate() and share the same accessor layout: report.data, report.metrics, report.inspection (where applicable).

  • Accessor methods that produce figures or tables return Display objects.

  • Displays share a single, predictable API:

    • plot(**kwargs) — render the visualization

    • frame(**kwargs) — return the data as a pandas.DataFrame

    • set_style(policy=..., **kwargs) — customize appearance

    • help() — show available options

This consistency makes it easy to switch between report types and to reuse the same workflow across data, metrics, and inspection.
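The shared contract summarized above can be written down as a Python Protocol. The names mirror the four methods listed in the summary, but DisplayLike and FakeDisplay are a reader's sketch, not skore's actual classes:

```python
from typing import Any, Protocol, runtime_checkable

import pandas as pd


@runtime_checkable
class DisplayLike(Protocol):
    """Sketch of the four-method contract shared by skore displays."""

    def plot(self, **kwargs: Any) -> None: ...
    def frame(self, **kwargs: Any) -> pd.DataFrame: ...
    def set_style(self, **kwargs: Any) -> None: ...
    def help(self) -> None: ...


class FakeDisplay:
    # Minimal conforming object, for illustration only.
    def plot(self, **kwargs):
        pass

    def frame(self, **kwargs):
        return pd.DataFrame()

    def set_style(self, **kwargs):
        pass

    def help(self):
        print("plot(), frame(), set_style(), help()")
```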

Total running time of the script: (0 minutes 8.301 seconds)

Gallery generated by Sphinx-Gallery