.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/technical_details/plot_skore_hub_project.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_technical_details_plot_skore_hub_project.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_technical_details_plot_skore_hub_project.py:

.. _example_skore_hub_project:

=================
Hub skore Project
=================

This example shows how to use :class:`~skore.Project` in **hub** mode: store
reports remotely and inspect them. A key point is that
:meth:`~skore.Project.summarize` returns a
:class:`~skore.project._summary.Summary`, which is a subclass of
:class:`pandas.DataFrame`. In Jupyter you get an interactive widget, but you
can always inspect and filter the summary as a DataFrame if you prefer.

Examples
--------

To run this example and push the reports to your own Skore Hub workspace and
project, set ``WORKSPACE`` and ``PROJECT`` on the command line:

.. code-block:: bash

    WORKSPACE= PROJECT= python plot_skore_hub_project.py

In this gallery, the reports are pushed to a public workspace.

.. GENERATED FROM PYTHON SOURCE LINES 27-34

.. code-block:: Python

    from skore import login

    login()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ╭───────────────────────────────── Login to Skore Hub ─────────────────────────────────╮
    │                                                                                      │
    │ Successfully logged in, using API key.                                               │
    │                                                                                      │
    ╰──────────────────────────────────────────────────────────────────────────────────────╯

.. GENERATED FROM PYTHON SOURCE LINES 69-78

.. code-block:: Python

    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from skore import train_test_split
    from skrub import tabular_pipeline

    X, y = load_breast_cancer(return_X_y=True, as_frame=True)
    split_data = train_test_split(X=X, y=y, random_state=42, as_dict=True)
    estimator = tabular_pipeline(LogisticRegression(max_iter=1_000))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ╭────────────────────── HighClassImbalanceTooFewExamplesWarning ───────────────────────╮
    │ It seems that you have a classification problem with at least one class with fewer   │
    │ than 100 examples in the test set. In this case, using train_test_split may not be a │
    │ good idea because of high variability in the scores obtained on the test set. We     │
    │ suggest three options to tackle this challenge: you can increase test_size, collect  │
    │ more data, or use skore's CrossValidationReport with the `splitter` parameter of     │
    │ your choice.                                                                         │
    ╰──────────────────────────────────────────────────────────────────────────────────────╯
    ╭───────────────────────────────── ShuffleTrueWarning ─────────────────────────────────╮
    │ We detected that the `shuffle` parameter is set to `True` either explicitly or from  │
    │ its default value. In case of time-ordered events (even if they are independent),    │
    │ this will result in inflated model performance evaluation because natural drift will │
    │ not be taken into account. We recommend setting the shuffle parameter to `False` in  │
    │ order to ensure the evaluation process is really representative of your production   │
    │ release process.                                                                     │
    ╰──────────────────────────────────────────────────────────────────────────────────────╯

.. GENERATED FROM PYTHON SOURCE LINES 79-95

.. code-block:: Python

    from numpy import logspace
    from sklearn.base import clone
    from skore import EstimatorReport, Project

    # WORKSPACE and PROJECT must already be defined at this point; the lines
    # that set them are not shown in this rendering.
    project = Project(f"{WORKSPACE}/{PROJECT}", mode="hub")

    # Store one report per value of C, log-spaced between 1e-3 and 1e3.
    for regularization in logspace(-3, 3, 5):
        project.put(
            f"lr-regularization-{regularization:.1e}",
            EstimatorReport(
                clone(estimator).set_params(logisticregression__C=regularization),
                **split_data,
                pos_label=1,
            ),
        )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Putting lr-regularization-1.0e-03 0:00:38
    Consult your report at https://skore.probabl.ai/skore/example-skore-hub-project-dev/estimators/1448
    Putting lr-regularization-3.2e-02 0:00:36
    Consult your report at https://skore.probabl.ai/skore/example-skore-hub-project-dev/estimators/1449
    Putting lr-regularization-1.0e+00 0:00:37
    Consult your report at https://skore.probabl.ai/skore/example-skore-hub-project-dev/estimators/1450
    Putting lr-regularization-3.2e+01 0:00:35
    Consult your report at https://skore.probabl.ai/skore/example-skore-hub-project-dev/estimators/1451
    Putting lr-regularization-1.0e+03 0:00:36
    Consult your report at https://skore.probabl.ai/skore/example-skore-hub-project-dev/estimators/1452

.. GENERATED FROM PYTHON SOURCE LINES 96-102

Summarize: you get a DataFrame
==============================

:meth:`~skore.Project.summarize` returns a
:class:`~skore.project._summary.Summary`, which subclasses
:class:`pandas.DataFrame`. In a Jupyter environment it renders an interactive
parallel-coordinates widget by default.

.. GENERATED FROM PYTHON SOURCE LINES 102-104

.. code-block:: Python

    summary = project.summarize()

.. GENERATED FROM PYTHON SOURCE LINES 105-107

To see the plain DataFrame table instead of the widget (e.g. in scripts or
when you prefer the table), wrap the summary in :class:`pandas.DataFrame`:

.. GENERATED FROM PYTHON SOURCE LINES 107-112

.. code-block:: Python

    import pandas as pd

    pandas_summary = pd.DataFrame(summary)
    pandas_summary

.. raw:: html
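
    <table border="1" class="dataframe">
      <thead>
        <tr><th></th><th>id</th><th>key</th><th>date</th><th>learner</th><th>ml_task</th><th>report_type</th><th>dataset</th><th>rmse</th><th>log_loss</th><th>roc_auc</th><th>fit_time</th><th>predict_time</th><th>rmse_mean</th><th>log_loss_mean</th><th>roc_auc_mean</th><th>fit_time_mean</th><th>predict_time_mean</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><th>skore:report:estimator:1448</th><td>lr-regularization-1.0e-03</td><td>2026-03-12T17:29:05.051852+00:00</td><td>LogisticRegression</td><td>binary-classification</td><td>estimator</td><td>0966e6e4b6a8c8bd5b0e6bd95f36939d</td><td>None</td><td>0.388992</td><td>0.995214</td><td>0.134774</td><td>0.080897</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>
        <tr><th>1</th><th>skore:report:estimator:1449</th><td>lr-regularization-3.2e-02</td><td>2026-03-12T17:29:42.214998+00:00</td><td>LogisticRegression</td><td>binary-classification</td><td>estimator</td><td>0966e6e4b6a8c8bd5b0e6bd95f36939d</td><td>None</td><td>0.114416</td><td>0.998752</td><td>0.287037</td><td>0.044496</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>
        <tr><th>2</th><th>skore:report:estimator:1450</th><td>lr-regularization-1.0e+00</td><td>2026-03-12T17:30:19.698017+00:00</td><td>LogisticRegression</td><td>binary-classification</td><td>estimator</td><td>0966e6e4b6a8c8bd5b0e6bd95f36939d</td><td>None</td><td>0.054072</td><td>0.997919</td><td>0.081626</td><td>0.052578</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>
        <tr><th>3</th><th>skore:report:estimator:1451</th><td>lr-regularization-3.2e+01</td><td>2026-03-12T17:30:55.592403+00:00</td><td>LogisticRegression</td><td>binary-classification</td><td>estimator</td><td>0966e6e4b6a8c8bd5b0e6bd95f36939d</td><td>None</td><td>0.084849</td><td>0.996255</td><td>0.074412</td><td>0.042013</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>
        <tr><th>4</th><th>skore:report:estimator:1452</th><td>lr-regularization-1.0e+03</td><td>2026-03-12T17:31:31.819065+00:00</td><td>LogisticRegression</td><td>binary-classification</td><td>estimator</td><td>0966e6e4b6a8c8bd5b0e6bd95f36939d</td><td>None</td><td>0.584540</td><td>0.988140</td><td>0.077690</td><td>0.040308</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>
      </tbody>
    </table>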


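The ``key`` values in the table above follow from the regularization grid used
when the reports were stored. As a minimal sketch, the five keys can be
reproduced with the standard library alone; the explicit powers of ten below
stand in for ``numpy.logspace(-3, 3, 5)`` and are an illustration, not part of
the example script.

.. code-block:: Python

    # Stand-in for numpy.logspace(-3, 3, 5): 5 exponents evenly spaced in [-3, 3].
    start, stop, num = -3, 3, 5
    exponents = [start + i * (stop - start) / (num - 1) for i in range(num)]
    grid = [10.0**e for e in exponents]

    # Same key format as in the loop that stores the reports.
    keys = [f"lr-regularization-{c:.1e}" for c in grid]
    print(keys)

Each key encodes the value of ``C`` to two significant digits, which is enough
to tell the five reports apart when filtering the summary.
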
.. GENERATED FROM PYTHON SOURCE LINES 113-115

The summary holds, for each report, the metadata and metrics needed to filter
the reports quickly.

.. GENERATED FROM PYTHON SOURCE LINES 115-117

.. code-block:: Python

    summary.info()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    MultiIndex: 5 entries, (0, 'skore:report:estimator:1448') to (4, 'skore:report:estimator:1452')
    Data columns (total 16 columns):
     #   Column             Non-Null Count  Dtype
    ---  ------             --------------  -----
     0   key                5 non-null      object
     1   date               5 non-null      object
     2   learner            5 non-null      category
     3   ml_task            5 non-null      object
     4   report_type        5 non-null      object
     5   dataset            5 non-null      object
     6   rmse               0 non-null      object
     7   log_loss           5 non-null      float64
     8   roc_auc            5 non-null      float64
     9   fit_time           5 non-null      float64
     10  predict_time       5 non-null      float64
     11  rmse_mean          0 non-null      object
     12  log_loss_mean      0 non-null      object
     13  roc_auc_mean       0 non-null      object
     14  fit_time_mean      0 non-null      object
     15  predict_time_mean  0 non-null      object
    dtypes: category(1), float64(4), object(11)
    memory usage: 1.1+ KB

.. GENERATED FROM PYTHON SOURCE LINES 118-120

Filter reports by metric (e.g. keep only those with a log loss below a given
threshold) and work with the result as a table.

.. GENERATED FROM PYTHON SOURCE LINES 120-122

.. code-block:: Python

    summary.query("log_loss < 0.1")["key"].tolist()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ['lr-regularization-1.0e+00', 'lr-regularization-3.2e+01']

.. GENERATED FROM PYTHON SOURCE LINES 123-125

Use :meth:`~skore.project._summary.Summary.reports` to load the corresponding
reports from the project (optionally after filtering the summary).

.. GENERATED FROM PYTHON SOURCE LINES 125-128

.. code-block:: Python

    reports = summary.query("log_loss < 0.1").reports(return_as="comparison")
    len(reports.reports_)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    2

.. GENERATED FROM PYTHON SOURCE LINES 129-131

Since we got a :class:`~skore.ComparisonReport`, we can use the metrics
accessor to summarize the metrics across the reports.

.. GENERATED FROM PYTHON SOURCE LINES 131-133

.. code-block:: Python

    reports.metrics.summarize().frame()

.. raw:: html
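
    <table border="1" class="dataframe">
      <thead>
        <tr><th>Estimator</th><th>LogisticRegression_1</th><th>LogisticRegression_2</th></tr>
        <tr><th>Metric</th><th></th><th></th></tr>
      </thead>
      <tbody>
        <tr><th>Accuracy</th><td>0.993007</td><td>0.965035</td></tr>
        <tr><th>Precision</th><td>0.988889</td><td>0.988372</td></tr>
        <tr><th>Recall</th><td>1.000000</td><td>0.955056</td></tr>
        <tr><th>ROC AUC</th><td>0.997919</td><td>0.996255</td></tr>
        <tr><th>Brier score</th><td>0.013810</td><td>0.023881</td></tr>
        <tr><th>Fit time (s)</th><td>0.081626</td><td>0.074412</td></tr>
        <tr><th>Predict time (s)</th><td>0.040488</td><td>0.039703</td></tr>
      </tbody>
    </table>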


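The accuracy, precision and recall rows above can be cross-checked by hand. The
sketch below uses hypothetical confusion-matrix counts inferred from the
rounded table values (assuming a 143-sample test set with 89 positives); the
counts are an illustration and were not read from the stored reports.

.. code-block:: Python

    # Hypothetical (tp, fp, fn, tn) counts, inferred from the rounded metrics above.
    counts = {
        "LogisticRegression_1": (89, 1, 0, 53),
        "LogisticRegression_2": (85, 1, 4, 53),
    }

    results = {}
    for name, (tp, fp, fn, tn) in counts.items():
        accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of correct predictions
        precision = tp / (tp + fp)  # share of predicted positives that are correct
        recall = tp / (tp + fn)  # share of actual positives that are found
        results[name] = (accuracy, precision, recall)
        print(f"{name}: {accuracy:.6f} {precision:.6f} {recall:.6f}")

Under these assumed counts the two estimators differ by only four test samples,
which is in line with the variability warning raised when the data was split.
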
.. GENERATED FROM PYTHON SOURCE LINES 134-135

.. code-block:: Python

    reports.metrics.roc().plot(subplot_by=None)

.. image-sg:: /auto_examples/technical_details/images/sphx_glr_plot_skore_hub_project_001.png
   :alt: ROC Curve Positive label: 1 Data source: Test set
   :srcset: /auto_examples/technical_details/images/sphx_glr_plot_skore_hub_project_001.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (3 minutes 14.090 seconds)

.. _sphx_glr_download_auto_examples_technical_details_plot_skore_hub_project.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_skore_hub_project.ipynb <plot_skore_hub_project.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_skore_hub_project.py <plot_skore_hub_project.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_skore_hub_project.zip <plot_skore_hub_project.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_