Local skore Project#
This example shows how to use Project in local mode: store
reports on your machine and inspect them. A key point is that
summarize() returns a Summary,
which is a pandas.DataFrame subclass. In Jupyter you get an interactive widget, but
you can always inspect and filter the summary as a plain DataFrame if you prefer.
Create a local project and store reports#
We use a temporary directory as the workspace so the example is self-contained.
In practice you can omit workspace to use the default (e.g. a skore/
directory in your user cache).
from pathlib import Path
from tempfile import TemporaryDirectory
from skore import Project
tmp_dir = TemporaryDirectory()
tmp_path = Path(tmp_dir.name)
project = Project("example-project", workspace=tmp_path)
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from skore import train_test_split
from skrub import tabular_pipeline
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
split_data = train_test_split(X=X, y=y, random_state=42, as_dict=True)
estimator = tabular_pipeline(LogisticRegression(max_iter=1_000))
╭────────────────────── HighClassImbalanceTooFewExamplesWarning ───────────────────────╮
│ It seems that you have a classification problem with at least one class with fewer │
│ than 100 examples in the test set. In this case, using train_test_split may not be a │
│ good idea because of high variability in the scores obtained on the test set. We │
│ suggest three options to tackle this challenge: you can increase test_size, collect │
│ more data, or use skore's CrossValidationReport with the `splitter` parameter of │
│ your choice. │
╰──────────────────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────── ShuffleTrueWarning ─────────────────────────────────╮
│ We detected that the `shuffle` parameter is set to `True` either explicitly or from │
│ its default value. In case of time-ordered events (even if they are independent), │
│ this will result in inflated model performance evaluation because natural drift will │
│ not be taken into account. We recommend setting the shuffle parameter to `False` in │
│ order to ensure the evaluation process is really representative of your production │
│ release process. │
╰──────────────────────────────────────────────────────────────────────────────────────╯
import numpy as np
from sklearn.base import clone
from skore import EstimatorReport
for regularization in np.logspace(-7, 7, 31):
    report = EstimatorReport(
        clone(estimator).set_params(logisticregression__C=regularization),
        **split_data,
        pos_label=1,
    )
    project.put(f"lr-regularization-{regularization:.1e}", report)
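For reference, the grid above comes from np.logspace(-7, 7, 31): 31 values spaced evenly on a log scale from 1e-7 to 1e7, with C=1.0 (scikit-learn's default for LogisticRegression) at the midpoint:

```python
import numpy as np

# The regularization grid used above: 31 log-spaced values
grid = np.logspace(-7, 7, 31)
print(len(grid))  # 31
# Endpoints are 1e-7 and 1e7; the middle entry is (up to rounding) 1.0
print(grid[0], grid[-1])
```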
Summarize: you get a DataFrame#
summarize() returns a Summary,
which subclasses pandas.DataFrame. In a Jupyter environment it renders
an interactive parallel-coordinates widget by default.
summary = project.summarize()
To see the normal DataFrame table instead of the widget (e.g. in scripts or
when you prefer the table), wrap the summary in pandas.DataFrame:
import pandas as pd
pandas_summary = pd.DataFrame(summary)
pandas_summary
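Because Summary subclasses pandas.DataFrame, every pandas operation is available on it. As a sketch with made-up values (only the column names mirror the summary above), you can rank reports by a metric:

```python
import pandas as pd

# Toy frame mimicking a few summary columns (values are made up)
toy = pd.DataFrame({
    "key": ["a", "b", "c"],
    "log_loss": [0.30, 0.08, 0.15],
    "roc_auc": [0.91, 0.99, 0.95],
})
# Standard pandas operations apply, e.g. ranking reports by a metric
best_first = toy.sort_values("log_loss")
print(best_first["key"].tolist())  # ['b', 'c', 'a']
```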
The summary contains the metadata we need to quickly filter the reports: the storage key, date, learner, ML task, dataset, and the main metrics.
<class 'skore._project._summary.Summary'>
MultiIndex: 31 entries, (0, '1b92268a9df1ed70aebdece3fe4ff701') to (30, '2cab9dea8c0f89dc68dfa3ace1b27d6c')
Data columns (total 16 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 key 31 non-null object
1 date 31 non-null object
2 learner 31 non-null category
3 ml_task 31 non-null object
4 report_type 31 non-null object
5 dataset 31 non-null object
6 rmse 0 non-null object
7 log_loss 31 non-null float64
8 roc_auc 31 non-null float64
9 fit_time 31 non-null float64
10 predict_time 31 non-null float64
11 rmse_mean 0 non-null object
12 log_loss_mean 0 non-null object
13 roc_auc_mean 0 non-null object
14 fit_time_mean 0 non-null object
15 predict_time_mean 0 non-null object
dtypes: category(1), float64(4), object(11)
memory usage: 5.3+ KB
Filter reports by metric (e.g. keep only those below a given log loss) and work with the result as a table.
summary.query("log_loss < 0.1")["key"].tolist()
['lr-regularization-1.2e-01', 'lr-regularization-3.4e-01', 'lr-regularization-1.0e+00', 'lr-regularization-2.9e+00', 'lr-regularization-8.6e+00', 'lr-regularization-2.5e+01']
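Since the regularization strength is encoded in each key, you can recover it with plain pandas string methods. A small sketch using the key format from this example (the key list is abridged):

```python
import pandas as pd

keys = pd.Series([
    "lr-regularization-1.2e-01",
    "lr-regularization-1.0e+00",
    "lr-regularization-2.5e+01",
])
# Strip the fixed prefix, then parse the scientific-notation suffix
C_values = keys.str.removeprefix("lr-regularization-").astype(float)
print(C_values.tolist())  # [0.12, 1.0, 25.0]
```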
Use reports() to load the corresponding
reports from the project (optionally after filtering the summary).
reports = summary.query("log_loss < 0.1").reports(return_as="comparison")
len(reports.reports_)
6
Since we got a ComparisonReport, we can use the metrics accessor
to summarize the metrics across the reports.
reports.metrics.summarize().frame()
reports.metrics.roc().plot(subplot_by=None)

project.delete("example-project", workspace=tmp_path)
tmp_dir.cleanup()
Total running time of the script: (0 minutes 6.566 seconds)