EstimatorReport.metrics.add: Adapt skore to your use case

By default, summarize() reports a curated set of metrics for your ML task. In practice you often need domain-specific scores: a business cost function, a custom fairness measure, an F-beta with a particular beta, etc.

This example walks through how to register such metrics with add() so they are computed and displayed alongside the built-in ones.

Setting up a classification problem

import skore
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

We create an EstimatorReport through evaluate() using a simple train/test split; splitter=0.2 holds out 20% of the samples for testing. pos_label=1 marks the benign class as the positive class.

report = skore.evaluate(
    LogisticRegression(max_iter=10_000), X, y, pos_label=1, splitter=0.2
)

Let’s look at the default metrics:

report.metrics.summarize().frame()
                  LogisticRegression
Metric
Accuracy                    0.947368
Precision                   0.984127
Recall                      0.925373
ROC AUC                     0.993649
Log loss                    0.110103
Brier score                 0.036139
Fit time (s)                0.199774
Predict time (s)            0.000118


Adding a plain callable

Any function with the signature (y_true, y_pred, **kwargs) -> float can be registered with add(). The function name is used as the metric name by default.

def specificity(y_true, y_pred):
    """Proportion of true negatives among actual negatives."""
    tn = ((y_true == 0) & (y_pred == 0)).sum()
    fp = ((y_true == 0) & (y_pred == 1)).sum()
    return tn / (tn + fp)


report.metrics.add(specificity, greater_is_better=True)
report.metrics.summarize().frame()
                  LogisticRegression
Metric
Accuracy                    0.947368
Precision                   0.984127
Recall                      0.925373
ROC AUC                     0.993649
Log loss                    0.110103
Brier score                 0.036139
Fit time (s)                0.199774
Predict time (s)            0.000118
Specificity                 0.978723


specificity now appears alongside the built-in metrics.
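
As a quick cross-check (on a small synthetic label vector, independent of the report above), specificity is simply the recall of the negative class, so the helper can be compared against sklearn's recall_score with pos_label=0:

import numpy as np
from sklearn.metrics import recall_score

y_true_check = np.array([0, 0, 1, 1, 0, 1])
y_pred_check = np.array([0, 1, 1, 1, 0, 0])

# Both expressions compute TN / (TN + FP) on the negative class.
assert np.isclose(
    specificity(y_true_check, y_pred_check),
    recall_score(y_true_check, y_pred_check, pos_label=0),
)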

Passing extra keyword arguments

If your metric needs extra data at scoring time (e.g. sample-level amounts, a cost matrix, …), pass them as keyword arguments to add(). They will be forwarded to the metric function when it is computed.

def misclassification_cost(y_true, y_pred, cost_fp, cost_fn):
    """Total cost of misclassifications, weighted by error type."""
    fp = ((y_true == 0) & (y_pred == 1)).sum()
    fn = ((y_true == 1) & (y_pred == 0)).sum()
    return cost_fp * fp + cost_fn * fn


report.metrics.add(
    misclassification_cost,
    greater_is_better=False,
    cost_fp=1.0,
    cost_fn=10.0,
)

report.metrics.summarize().frame()
                        LogisticRegression
Metric
Accuracy                          0.947368
Precision                         0.984127
Recall                            0.925373
ROC AUC                           0.993649
Log loss                          0.110103
Brier score                       0.036139
Fit time (s)                      0.199774
Predict time (s)                  0.000118
Specificity                       0.978723
Misclassification Cost           51.000000
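
The reported value is consistent with one false positive and five false negatives on this test split: 1.0 * 1 + 10.0 * 5 = 51.0.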


Adding an sklearn scorer

If you already have a make_scorer() object, you can register it directly. The response_method and greater_is_better metadata are extracted from the scorer automatically.

from sklearn.metrics import fbeta_score, make_scorer

f2_scorer = make_scorer(fbeta_score, beta=2, response_method="predict", pos_label=1)
report.metrics.add(f2_scorer, name="f2")

report.metrics.summarize().frame()
                        LogisticRegression
Metric
Accuracy                          0.947368
Precision                         0.984127
Recall                            0.925373
ROC AUC                           0.993649
Log loss                          0.110103
Brier score                       0.036139
Fit time (s)                      0.199774
Predict time (s)                  0.000118
Specificity                       0.978723
Misclassification Cost           51.000000
F2                                0.936556
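
As a sanity check, fbeta_score combines precision P and recall R as (1 + beta**2) * P * R / (beta**2 * P + R). With beta=2 and the precision and recall reported above (0.984127 and 0.925373), this works out to about 0.9366, matching the F2 row.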


Cherry-picking metrics to display

Once registered, custom metrics can be selected by name in summarize():

report.metrics.summarize(
    metric=["specificity", "f2", "misclassification_cost"],
).frame()
                        LogisticRegression
Metric
Specificity                       0.978723
F2                                0.936556
Misclassification Cost           51.000000
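
Note that metrics are selected by the names they were registered under (the function name, or the name passed to add(), such as "f2"), even though the frame displays title-cased labels like "Misclassification Cost".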


Selecting data_source="both" lets you compare train vs. test in one call:

report.metrics.summarize(metric=["specificity", "f2"], data_source="both").frame()
             LogisticRegression (train)  LogisticRegression (test)
Metric
Specificity                    0.933333                   0.978723
F2                             0.975945                   0.936556


Using a different response method

By default, callables receive the output of estimator.predict(X). If your metric needs probabilities instead, set response_method="predict_proba".

import numpy as np


def mean_confidence(y_true, y_proba):
    """Average predicted probability assigned to the true class."""
    return np.where(y_true == 1, y_proba, 1 - y_proba).mean()


report.metrics.add(
    mean_confidence, response_method="predict_proba", greater_is_better=True
)

report.metrics.summarize(metric="mean_confidence").frame()
                 LogisticRegression
Metric
Mean Confidence            0.931102
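
Assuming y_proba holds the probability of the positive class, this metric equals 1 minus the mean absolute difference between y_true and y_proba, so values close to 1 indicate confident, mostly correct probability estimates.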


Total running time of the script: (0 minutes 0.245 seconds)
