.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/integrations/plot_sklearn_api.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_integrations_plot_sklearn_api.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_integrations_plot_sklearn_api.py:

.. _example_sklearn_api:

===================================================
Using skore with scikit-learn compatible estimators
===================================================

This example shows how to use skore with scikit-learn compatible estimators.

Any model that can be used with the scikit-learn API can be used with skore.
Use :func:`~skore.evaluate` to create a report from any estimator that has
``fit`` and ``predict`` methods (or only ``predict`` if it is already fitted).

.. note::

    When computing the ROC AUC or ROC curve for a classification task, the
    estimator must have a ``predict_proba`` method.

In this example, we showcase a gradient boosting model (XGBoost) and a custom
estimator.

Note that this example is not exhaustive; many other scikit-learn compatible
models can be used with skore:

- More gradient boosting libraries such as LightGBM and CatBoost,
- Deep learning frameworks such as Keras and skorch (a wrapper for PyTorch),
- Tabular foundation models such as TabICL and TabPFN,
- etc.

.. GENERATED FROM PYTHON SOURCE LINES 42-47

Generate a classification dataset
=================================

To illustrate the compatibility with scikit-learn estimators, we first generate
a synthetic binary classification dataset with 1,000 samples.

.. GENERATED FROM PYTHON SOURCE LINES 49-57

.. code-block:: Python

    import pandas as pd
    import skrub
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=1_000, random_state=42)
    X = pd.DataFrame(X, columns=[f"Feature_{i}" for i in range(X.shape[1])])
    skrub.TableReport(X)

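As a rough illustration of the method requirements listed in the introduction, one can probe an estimator with ``getattr`` to see which scikit-learn style methods it exposes. The sketch below is only illustrative: ``describe_estimator_api`` is a hypothetical helper (it is not part of skore or scikit-learn), and it makes no claim about how skore itself performs this check.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC


def describe_estimator_api(estimator):
    """Report which scikit-learn style methods an estimator exposes.

    Hypothetical helper for illustration only; not part of skore.
    """
    methods = ("fit", "predict", "predict_proba", "decision_function")
    return {name: callable(getattr(estimator, name, None)) for name in methods}


# LogisticRegression exposes predict_proba, whereas LinearSVC only offers
# decision_function, so ROC-based metrics would need another route for it.
logistic_api = describe_estimator_api(LogisticRegression())
svc_api = describe_estimator_api(LinearSVC())

print(logistic_api)
print(svc_api)
```

An estimator for which ``predict_proba`` is reported as missing can still be fitted and used for predictions; only the probability-based metrics mentioned in the note are affected.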
.. GENERATED FROM PYTHON SOURCE LINES 58-69

Gradient-boosted decision trees with XGBoost
============================================

While ``skore`` is designed to be fully compatible with classifiers and
regressors from the scikit-learn library, it is also compatible with any
classifier or regressor that follows the scikit-learn API as defined in the
scikit-learn documentation.

Here, we showcase a gradient-boosted decision trees model from the XGBoost
library that follows exactly this paradigm.

.. GENERATED FROM PYTHON SOURCE LINES 71-79

.. code-block:: Python

    from skore import evaluate
    from xgboost import XGBClassifier

    xgb = XGBClassifier(n_estimators=50, max_depth=3, learning_rate=0.1, random_state=42)
    xgb_report = evaluate(xgb, X, y, splitter=0.2, pos_label=1)
    xgb_report

The rendered report (an interactive HTML widget, omitted here) summarizes its
checks as: 0 issue(s), 0 tip(s), 3 passed, 0 ignored.

.. GENERATED FROM PYTHON SOURCE LINES 80-82

We see that we get the same report as when using a scikit-learn classifier and
we can access the different elements.

.. GENERATED FROM PYTHON SOURCE LINES 84-86

.. code-block:: Python

    xgb_report.metrics.summarize().frame()

================  =============
Metric            XGBClassifier
================  =============
Accuracy          0.900000
Precision         0.989899
Recall            0.837607
ROC AUC           0.980126
Log loss          0.218888
Brier score       0.064364
Fit time (s)      0.039216
Predict time (s)  0.001271
================  =============

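For comparison, metrics of the same kind can be computed by hand with scikit-learn alone. The following is a minimal sketch under stated assumptions: it uses scikit-learn's ``GradientBoostingClassifier`` as a stand-in for XGBoost, and a manual 20% hold-out split chosen for illustration (this page does not confirm how skore interprets ``splitter=0.2``). The values it prints are therefore not expected to match the table above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (
    accuracy_score,
    brier_score_loss,
    log_loss,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, random_state=42)

# Assumption: a 20% hold-out test set, split by hand for this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Stand-in model: scikit-learn's gradient boosting, not XGBoost.
model = GradientBoostingClassifier(
    n_estimators=50, max_depth=3, learning_rate=0.1, random_state=42
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
proba = model.predict_proba(X_test)  # shape (n_samples, 2)
y_score = proba[:, 1]  # probability of the positive class (label 1)

manual_metrics = {
    "Accuracy": accuracy_score(y_test, y_pred),
    "Precision": precision_score(y_test, y_pred, pos_label=1),
    "Recall": recall_score(y_test, y_pred, pos_label=1),
    "ROC AUC": roc_auc_score(y_test, y_score),
    "Log loss": log_loss(y_test, proba),
    "Brier score": brier_score_loss(y_test, y_score),
}

for name, value in manual_metrics.items():
    print(f"{name}: {value:.6f}")
```

Note that the last three metrics rely on ``predict_proba``, which is why the estimator needs that method for probability-based scores.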
.. GENERATED FROM PYTHON SOURCE LINES 87-88

Beyond the summary of metrics, we can also plot a ROC curve, for example:

.. GENERATED FROM PYTHON SOURCE LINES 90-92

.. code-block:: Python

    _ = xgb_report.metrics.roc().plot()

.. image-sg:: /auto_examples/integrations/images/sphx_glr_plot_sklearn_api_001.png
   :alt: ROC Curve for XGBClassifier Positive label: 1 Data source: Test set
   :srcset: /auto_examples/integrations/images/sphx_glr_plot_sklearn_api_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 93-94

We can also inspect our model:

.. GENERATED FROM PYTHON SOURCE LINES 96-98

.. code-block:: Python

    _ = xgb_report.inspection.permutation_importance().plot()

.. image-sg:: /auto_examples/integrations/images/sphx_glr_plot_sklearn_api_002.png
   :alt: Permutation importance of XGBClassifier on test set
   :srcset: /auto_examples/integrations/images/sphx_glr_plot_sklearn_api_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 99-106

Custom model
============

Next, we show how to create a custom estimator that follows the requirements
of the scikit-learn API. Here, we create a nearest-neighbor classifier:

.. GENERATED FROM PYTHON SOURCE LINES 108-133

.. code-block:: Python

    import numpy as np
    from sklearn.base import BaseEstimator, ClassifierMixin
    from sklearn.metrics import euclidean_distances
    from sklearn.utils.multiclass import unique_labels
    from sklearn.utils.validation import check_is_fitted, validate_data


    class CustomClassifier(ClassifierMixin, BaseEstimator):
        def __init__(self):
            pass

        def fit(self, X, y):
            # Validate the inputs and memorize the training set.
            X, y = validate_data(self, X, y)
            self.classes_ = unique_labels(y)
            self.X_ = X
            self.y_ = y
            return self

        def predict(self, X):
            check_is_fitted(self)
            X = validate_data(self, X, reset=False)
            # Return the label of the closest training sample.
            closest = np.argmin(euclidean_distances(X, self.X_), axis=1)
            return self.y_[closest]

.. GENERATED FROM PYTHON SOURCE LINES 134-137

.. code-block:: Python

    custom_report = evaluate(CustomClassifier(), X, y, splitter=0.2)
    custom_report

The rendered report (an interactive HTML widget, omitted here) summarizes its
checks as: 1 issue(s), 0 tip(s), 2 passed, 0 ignored.

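The note at the top of this example says that ROC-based metrics require a ``predict_proba`` method, which the nearest-neighbor classifier above does not define. A minimal sketch of how such a method could be added is shown below, assuming hard one-hot probabilities (1.0 for the nearest neighbor's class, 0.0 elsewhere) are acceptable. ``NearestNeighborProbaClassifier`` is a hypothetical name, and the sketch uses ``check_X_y`` and ``check_array`` rather than ``validate_data`` so that it also runs on older scikit-learn versions. Whether skore then reports ROC metrics for this estimator is not verified here.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.metrics import euclidean_distances
from sklearn.utils.multiclass import unique_labels
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class NearestNeighborProbaClassifier(ClassifierMixin, BaseEstimator):
    """1-nearest-neighbor classifier with one-hot probabilities (illustrative)."""

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        self.classes_ = unique_labels(y)  # sorted array of class labels
        self.X_ = X
        self.y_ = y
        return self

    def _closest(self, X):
        # Index of the nearest training sample for each row of X.
        check_is_fitted(self)
        X = check_array(X)
        return np.argmin(euclidean_distances(X, self.X_), axis=1)

    def predict(self, X):
        return self.y_[self._closest(X)]

    def predict_proba(self, X):
        labels = self.y_[self._closest(X)]
        # classes_ is sorted, so searchsorted maps each label to its column.
        columns = np.searchsorted(self.classes_, labels)
        proba = np.zeros((len(labels), len(self.classes_)))
        proba[np.arange(len(labels)), columns] = 1.0
        return proba


# Tiny hand-made dataset: two well-separated clusters.
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [6.0, 6.0]])
y_train = np.array([0, 0, 1, 1])
X_new = np.array([[0.2, 0.1], [5.5, 5.4]])

clf = NearestNeighborProbaClassifier().fit(X_train, y_train)
predictions = clf.predict(X_new)
probabilities = clf.predict_proba(X_new)

print(predictions)
print(probabilities)
```

One-hot probabilities yield a very coarse ROC curve; averaging over several neighbors would give smoother scores, at the cost of a more involved implementation.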
.. GENERATED FROM PYTHON SOURCE LINES 138-153

Conclusion
==========

This example demonstrates how skore can be used with scikit-learn compatible
estimators. This allows practitioners to use consistent reporting and
visualization tools across different estimators.

.. seealso::

    For an example of wrapping Large Language Models (LLMs) to be compatible
    with scikit-learn APIs, see the tutorial on "Quantifying LLMs Uncertainty
    with Conformal Predictions". The article demonstrates how to wrap models
    like Mistral-7B-Instruct in a scikit-learn-compatible interface.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 2.915 seconds)

.. _sphx_glr_download_auto_examples_integrations_plot_sklearn_api.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_sklearn_api.ipynb <plot_sklearn_api.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_sklearn_api.py <plot_sklearn_api.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_sklearn_api.zip <plot_sklearn_api.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    Gallery generated by Sphinx-Gallery