.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/technical_details/plot_custom_checks.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_technical_details_plot_custom_checks.py:

.. _example_custom_checks:

===============================
Adding custom diagnostic checks
===============================

`skore` lets you extend the built-in diagnostic checks with your own. This
example shows how to write a custom check and register it with a report via
:meth:`~skore.EstimatorReport.add_checks`.

.. GENERATED FROM PYTHON SOURCE LINES 14-32

Writing a custom check for a single estimator
=============================================

We start by defining a simple check that flags datasets with a large number of
features (more than 50 in this example). The check inspects the test data
attached to the report. When the test data is not available, the check raises
``CheckNotApplicable`` to signal that it does not apply.

The check is written as a subclass of :class:`~skore.Check` whose
``check_function`` method holds the logic. An instance of that class is then
registered with the report via :meth:`~skore.EstimatorReport.add_checks`.

The `docs_url` attribute is optional. When provided as a full URL (starting with
``"http"``), it is used as-is. When it is a plain anchor string, it points to the
skore diagnostic user guide. When omitted entirely, no documentation link is
shown.

We set the severity to "tip" because a high feature count is a cautionary note
about the dataset rather than a problem to fix. Set the severity to "issue"
instead when the check reports something that should be fixed.

.. GENERATED FROM PYTHON SOURCE LINES 32-57

.. code-block:: Python

    import numpy as np
    from skore import Check, CheckNotApplicable


    class CustomCheck1(Check):
        code = "CSTM001"
        title = "High feature count"
        report_type = "estimator"
        severity = "tip"
        docs_url = "https://scikit-learn.org/stable/modules/feature_selection.html#feature-selection"

        def check_function(self, report):
            """Flag when the number of features exceeds a threshold."""
            # Without test data there is nothing to inspect.
            if report.X_test is None:
                raise CheckNotApplicable()
            # Read the feature count from the report's own test data.
            n_features = report.X_test.shape[1]
            if n_features > 50:
                return (
                    f"The dataset has {n_features} features which may hurt model performance. "
                    "Consider feature selection or dimensionality reduction."
                )
            # Returning None means the check found nothing to report.
            return None

.. GENERATED FROM PYTHON SOURCE LINES 58-67

Registering the check
=====================

:meth:`~skore.EstimatorReport.add_checks` accepts a list of ``Check`` instances
and registers them. The next call to :meth:`~skore.EstimatorReport.diagnose`
runs any newly added checks on top of the built-in checks.

We can then find the new check in the Tips tab of the diagnostic, along with
another tip informing us that the dataset is not standardized.

.. GENERATED FROM PYTHON SOURCE LINES 67-78

.. code-block:: Python

    from sklearn.linear_model import LinearRegression
    from skore import evaluate

    rng = np.random.default_rng(42)
    # 200 samples with 80 features, so the 50-feature threshold is exceeded.
    X = rng.normal(size=(200, 80))
    y = X[:, 0] + rng.normal(size=200)

    report = evaluate(LinearRegression(), X, y)
    report.add_checks([CustomCheck1()])
    report.diagnose()

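Returning to the three `docs_url` cases from the first section, the rule can be
written out as a small stand-alone function. This is only a sketch of the rule
as stated in this example, not skore's implementation: the function name
``resolve_docs_url`` and the address ``USER_GUIDE_BASE`` are hypothetical
placeholders, and the way the anchor is joined to the guide address is an
assumption.

```python
# Hypothetical placeholder: the real address of the skore diagnostic user
# guide is not given in this example.
USER_GUIDE_BASE = "https://example.invalid/skore/user_guide/diagnostic.html"


def resolve_docs_url(docs_url):
    """Return the documentation link for a check, or None when there is none."""
    if docs_url is None:
        # Omitted entirely: no documentation link is shown.
        return None
    if docs_url.startswith("http"):
        # Full URL: used as-is.
        return docs_url
    # Plain anchor string: assumed to point into the diagnostic user guide.
    return f"{USER_GUIDE_BASE}#{docs_url.lstrip('#')}"


print(resolve_docs_url("https://scikit-learn.org/stable/modules/feature_selection.html"))
print(resolve_docs_url("my-anchor"))
print(resolve_docs_url(None))
```

The three calls cover the full-URL, anchor, and omitted cases in that order.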

.. GENERATED FROM PYTHON SOURCE LINES 79-92

Cross-validation level checks
=============================

:class:`~skore.CrossValidationReport` and :class:`~skore.ComparisonReport` can
also receive custom checks, which run either on the full report or on the
component estimator reports. The `report_type` attribute of
:class:`~skore.Check` controls the scope of the check.

Let's write a check that is specific to cross-validation reports: it flags
metrics whose score standard deviation across splits exceeds 0.1. We set the
severity to "issue" to indicate that this is an issue to fix.

To illustrate the check, we corrupt the first fold of the target by replacing
the first fifth of its values with random noise.

.. GENERATED FROM PYTHON SOURCE LINES 92-128

.. code-block:: Python

    import pandas as pd

    # Replace the first fifth of the target with noise unrelated to X.
    y_noisy = y.copy()
    y_noisy[: len(y_noisy) // 5] = rng.normal(size=len(y_noisy) // 5)

    cv_report = evaluate(LinearRegression(), X, y_noisy, splitter=5)


    class CustomCheck2(Check):
        code = "CSTM002"
        title = "High score variance across CV splits"
        report_type = "cross-validation"
        docs_url = None
        severity = "issue"

        def check_function(self, report):
            """Flag high score variance across CV splits."""
            # One metrics table per split, computed on the test data.
            frames = [
                sub_report.metrics.summarize(data_source="test").data
                for sub_report in report.estimator_reports_
            ]
            scores = pd.concat(frames, ignore_index=True)
            # Keep the metrics whose standard deviation across splits is large.
            high_var_metrics = [
                metric_name
                for metric_name, group in scores.groupby("metric_verbose_name")
                if group["score"].std() > 0.1
            ]
            if high_var_metrics:
                return f"Metrics with high variance: {', '.join(high_var_metrics)}."
            return None


    cv_report.add_checks([CustomCheck2()])
    cv_report.diagnose()

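To see what the variance test inside ``CustomCheck2`` computes, the sketch below
applies the same rule to hand-written scores, using only the standard library in
place of the pandas ``groupby``. The metric names and scores are made up for
illustration and are not skore output. ``statistics.stdev`` is the sample
standard deviation, which matches the pandas default for ``std()``.

```python
from collections import defaultdict
from statistics import stdev

# Made-up (metric name, score) pairs, one per split; not real skore output.
rows = [
    ("R2", 0.91), ("R2", 0.15), ("R2", 0.89), ("R2", 0.90), ("R2", 0.92),
    ("Fit time (s)", 0.010), ("Fit time (s)", 0.011), ("Fit time (s)", 0.012),
    ("Fit time (s)", 0.010), ("Fit time (s)", 0.011),
]

# Group the scores by metric name, as groupby("metric_verbose_name") would.
scores_by_metric = defaultdict(list)
for metric_name, score in rows:
    scores_by_metric[metric_name].append(score)

# Keep the metrics whose sample standard deviation across splits exceeds 0.1.
high_var_metrics = [
    name for name, scores in scores_by_metric.items() if stdev(scores) > 0.1
]
print(high_var_metrics)  # ['R2']
```

A single poor split (0.15 here) is enough to push the standard deviation past
the threshold, which is the situation the corrupted first fold is meant to
produce.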

.. GENERATED FROM PYTHON SOURCE LINES 129-134

Aggregating checks across estimator reports
===========================================

We can also reuse our first check on the cross-validation report. Its
`report_type` is "estimator", so it runs on the component estimator reports and
the results are aggregated across splits.

.. GENERATED FROM PYTHON SOURCE LINES 134-138

.. code-block:: Python

    cv_report.add_checks([CustomCheck1()])
    cv_report.diagnose()


.. GENERATED FROM PYTHON SOURCE LINES 139-141

Similarly, :class:`~skore.ComparisonReport` aggregates checks across its
component reports.

.. GENERATED FROM PYTHON SOURCE LINES 141-148

.. code-block:: Python

    from sklearn.ensemble import RandomForestRegressor

    comparison_report = evaluate(
        [LinearRegression(), RandomForestRegressor()], X, y, splitter=5
    )
    comparison_report.add_checks([CustomCheck1(), CustomCheck2()])
    comparison_report.diagnose()


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 3.841 seconds)