Note

Go to the end to download the full example code.

`EstimatorReport`: Get insights from any scikit-learn estimator#

This example shows how the skore.EstimatorReport class can be used to quickly get insights from any scikit-learn estimator.

Loading our dataset and defining our estimator#

First, we load a dataset from skrub. Our goal is to predict if a healthcare manufacturing companies paid a medical doctors or hospitals, in order to detect potential conflict of interest.

from skrub.datasets import fetch_open_payments

dataset = fetch_open_payments()
df = dataset.X
y = dataset.y

Downloading 'open_payments' from https://github.com/skrub-data/skrub-data-files/raw/refs/heads/main/open_payments.zip (attempt 1/3)

from skrub import TableReport

TableReport(df)

Please enable javascript

The skrub table reports need javascript to display correctly. If you are displaying a report in a Jupyter notebook and you see this message, you may need to re-execute the cell or to trust the notebook (button on the top right or "File > Trust notebook").

TableReport(y.to_frame())

Please enable javascript

The skrub table reports need javascript to display correctly. If you are displaying a report in a Jupyter notebook and you see this message, you may need to re-execute the cell or to trust the notebook (button on the top right or "File > Trust notebook").

Looking at the distributions of the target, we observe that this classification task is quite imbalanced. It means that we have to be careful when selecting a set of statistical metrics to evaluate the classification performance of our predictive model. In addition, we see that the class labels are not specified by an integer 0 or 1 but instead by a string “allowed” or “disallowed”.

For our application, the label of interest is “allowed”.

pos_label, neg_label = "allowed", "disallowed"

Before training a predictive model, we need to split our dataset into a training and a validation set.

from skore import train_test_split

# If you have many dataframes to split on, you can always ask train_test_split to return
# a dictionary. Remember, it needs to be passed as a keyword argument!
split_data = train_test_split(X=df, y=y, random_state=42, as_dict=True)

╭───────────────────────────── HighClassImbalanceWarning ──────────────────────────────╮
│ It seems that you have a classification problem with a high class imbalance. In this │
│ case, using train_test_split may not be a good idea because of high variability in   │
│ the scores obtained on the test set. To tackle this challenge we suggest to use      │
│ skore's CrossValidationReport with the `splitter` parameter of your choice.          │
╰──────────────────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────── ShuffleTrueWarning ─────────────────────────────────╮
│ We detected that the `shuffle` parameter is set to `True` either explicitly or from  │
│ its default value. In case of time-ordered events (even if they are independent),    │
│ this will result in inflated model performance evaluation because natural drift will │
│ not be taken into account. We recommend setting the shuffle parameter to `False` in  │
│ order to ensure the evaluation process is really representative of your production   │
│ release process.                                                                     │
╰──────────────────────────────────────────────────────────────────────────────────────╯

By the way, notice how skore’s train_test_split() automatically warns us for a class imbalance.

Now, we need to define a predictive model. Hopefully, skrub provides a convenient function (skrub.tabular_learner()) when it comes to getting strong baseline predictive models with a single line of code. As its feature engineering is generic, it does not provide some handcrafted and tailored feature engineering but still provides a good starting point.

So let’s create a classifier for our task.

from skrub import tabular_learner

estimator = tabular_learner("classifier")
estimator

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/skrub/_tabular_pipeline.py:75: FutureWarning:

tabular_learner will be deprecated in the next release. Equivalent functionality is available in skrub.tabular_pipeline.

Pipeline(steps=[('tablevectorizer',
                 TableVectorizer(low_cardinality=ToCategorical())),
                ('histgradientboostingclassifier',
                 HistGradientBoostingClassifier())])

In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

Pipeline

?Documentation for PipelineiNot fitted

Parameters

	steps	[('tablevectorizer', ...), ('histgradientboostingclassifier', ...)]
	transform_input	None
	memory	None
	verbose	False

tablevectorizer: TableVectorizer

Parameters

	cardinality_threshold	40
	low_cardinality	ToCategorical()
	high_cardinality	StringEncoder()
	numeric	PassThrough()
	datetime	DatetimeEncoder()
	specific_transformers	()
	drop_null_fraction	1.0
	drop_if_constant	False
	drop_if_unique	False
	datetime_format	None
	n_jobs	None

numeric

PassThrough

Parameters

datetime

DatetimeEncoder

Parameters

	resolution	'hour'
	add_weekday	False
	add_total_seconds	True
	add_day_of_year	False
	periodic_encoding	None

low_cardinality

ToCategorical

Parameters

high_cardinality

StringEncoder

Parameters

	n_components	30
	vectorizer	'tfidf'
	ngram_range	(3, ...)
	analyzer	'char_wb'
	stop_words	None
	random_state	None

HistGradientBoostingClassifier

?Documentation for HistGradientBoostingClassifier

Parameters

	loss	'log_loss'
	learning_rate	0.1
	max_iter	100
	max_leaf_nodes	31
	max_depth	None
	min_samples_leaf	20
	l2_regularization	0.0
	max_features	1.0
	max_bins	255
	categorical_features	'from_dtype'
	monotonic_cst	None
	interaction_cst	None
	warm_start	False
	early_stopping	'auto'
	scoring	'loss'
	validation_fraction	0.1
	n_iter_no_change	10
	tol	1e-07
	verbose	0
	random_state	None
	class_weight	None

Getting insights from our estimator#

Introducing the `skore.EstimatorReport` class#

Now, we would be interested in getting some insights from our predictive model. One way is to use the skore.EstimatorReport class. This constructor will detect that our estimator is unfitted and will fit it for us on the training data.

from skore import EstimatorReport

report = EstimatorReport(estimator, **split_data, pos_label=pos_label)

Once the report is created, we get some information regarding the available tools allowing us to get some insights from our specific model on our specific task by calling the help() method.

report.help()

╭───────────── Tools to diagnose estimator HistGradientBoostingClassifier ─────────────╮
│ EstimatorReport                                                                      │
│ ├── .metrics                                                                         │
│ │   ├── .accuracy(...)         (↗︎)     - Compute the accuracy score.                 │
│ │   ├── .brier_score(...)      (↘︎)     - Compute the Brier score.                    │
│ │   ├── .confusion_matrix(...)         - Plot the confusion matrix.                  │
│ │   ├── .log_loss(...)         (↘︎)     - Compute the log loss.                       │
│ │   ├── .precision(...)        (↗︎)     - Compute the precision score.                │
│ │   ├── .precision_recall(...)         - Plot the precision-recall curve.            │
│ │   ├── .recall(...)           (↗︎)     - Compute the recall score.                   │
│ │   ├── .roc(...)                      - Plot the ROC curve.                         │
│ │   ├── .roc_auc(...)          (↗︎)     - Compute the ROC AUC score.                  │
│ │   ├── .timings(...)                  - Get all measured processing times related   │
│ │   │   to the estimator.                                                            │
│ │   ├── .custom_metric(...)            - Compute a custom metric.                    │
│ │   └── .summarize(...)                - Report a set of metrics for our estimator.  │
│ ├── .feature_importance                                                              │
│ │   └── .permutation(...)              - Report the permutation feature importance.  │
│ ├── .data                                                                            │
│ │   └── .analyze(...)                  - Plot dataset statistics.                    │
│ ├── .cache_predictions(...)            - Cache estimator's predictions.              │
│ ├── .clear_cache(...)                  - Clear the cache.                            │
│ ├── .get_predictions(...)              - Get estimator's predictions.                │
│ └── Attributes                                                                       │
│     ├── .X_test                        - Testing data                                │
│     ├── .X_train                       - Training data                               │
│     ├── .y_test                        - Testing target                              │
│     ├── .y_train                       - Training target                             │
│     ├── .estimator                     - Estimator to make the report from           │
│     ├── .estimator_                    - The cloned or copied estimator              │
│     ├── .estimator_name_               - The name of the estimator                   │
│     ├── .fit                           - Whether to fit the estimator on the         │
│     │   training data                                                                │
│     ├── .fit_time_                     - The time taken to fit the estimator, in     │
│     │   seconds                                                                      │
│     ├── .ml_task                       - No description available                    │
│     └── .pos_label                     - For binary classification, the positive     │
│         class                                                                        │
│                                                                                      │
│                                                                                      │
│ Legend:                                                                              │
│ (↗︎) higher is better (↘︎) lower is better                                             │
╰──────────────────────────────────────────────────────────────────────────────────────╯

Be aware that we can access the help for each individual sub-accessor. For instance:

report.metrics.help()

╭───────────────────────────── Available metrics methods ──────────────────────────────╮
│ report.metrics                                                                       │
│ ├── .accuracy(...)         (↗︎)     - Compute the accuracy score.                     │
│ ├── .brier_score(...)      (↘︎)     - Compute the Brier score.                        │
│ ├── .confusion_matrix(...)         - Plot the confusion matrix.                      │
│ ├── .log_loss(...)         (↘︎)     - Compute the log loss.                           │
│ ├── .precision(...)        (↗︎)     - Compute the precision score.                    │
│ ├── .precision_recall(...)         - Plot the precision-recall curve.                │
│ ├── .recall(...)           (↗︎)     - Compute the recall score.                       │
│ ├── .roc(...)                      - Plot the ROC curve.                             │
│ ├── .roc_auc(...)          (↗︎)     - Compute the ROC AUC score.                      │
│ ├── .timings(...)                  - Get all measured processing times related to    │
│ │   the estimator.                                                                   │
│ ├── .custom_metric(...)            - Compute a custom metric.                        │
│ └── .summarize(...)                - Report a set of metrics for our estimator.      │
│                                                                                      │
│                                                                                      │
│ Legend:                                                                              │
│ (↗︎) higher is better (↘︎) lower is better                                             │
╰──────────────────────────────────────────────────────────────────────────────────────╯

Metrics computation with aggressive caching#

At this point, we might be interested to have a first look at the statistical performance of our model on the validation set that we provided. We can access it by calling any of the metrics displayed above. Since we are greedy, we want to get several metrics at once and we will use the summarize() method.

import time

start = time.time()
metric_report = report.metrics.summarize().frame()
end = time.time()
metric_report

	HistGradientBoostingClassifier
Metric
Precision	0.687101
Recall	0.462199
ROC AUC	0.943399
Brier score	0.035105
Fit time (s)	8.952221
Predict time (s)	1.570994

print(f"Time taken to compute the metrics: {end - start:.2f} seconds")

Time taken to compute the metrics: 4.91 seconds

An interesting feature provided by the skore.EstimatorReport is the the caching mechanism. Indeed, when we have a large enough dataset, computing the predictions for a model is not cheap anymore. For instance, on our smallish dataset, it took a couple of seconds to compute the metrics. The report will cache the predictions and if we are interested in computing a metric again or an alternative metric that requires the same predictions, it will be faster. Let’s check by requesting the same metrics report again.

start = time.time()
metric_report = report.metrics.summarize().frame()
end = time.time()
metric_report

	HistGradientBoostingClassifier
Metric
Precision	0.687101
Recall	0.462199
ROC AUC	0.943399
Brier score	0.035105
Fit time (s)	8.952221
Predict time (s)	1.570994

print(f"Time taken to compute the metrics: {end - start:.2f} seconds")

Time taken to compute the metrics: 0.00 seconds

Note that when the model is fitted or the predictions are computed, we additionally store the time the operation took:

report.metrics.timings()

{'fit_time': 8.952221436999935, 'predict_time_test': 1.570994144999986}

Since we obtain a pandas dataframe, we can also use the plotting interface of pandas.

import matplotlib.pyplot as plt

ax = metric_report.plot.barh()
ax.set_title("Metrics report")
plt.tight_layout()

Whenever computing a metric, we check if the predictions are available in the cache and reload them if available. So for instance, let’s compute the log loss.

start = time.time()
log_loss = report.metrics.log_loss()
end = time.time()
log_loss

0.1237788718893213

print(f"Time taken to compute the log loss: {end - start:.2f} seconds")

Time taken to compute the log loss: 0.03 seconds

We can show that without initial cache, it would have taken more time to compute the log loss.

report.clear_cache()

start = time.time()
log_loss = report.metrics.log_loss()
end = time.time()
log_loss

0.1237788718893213

print(f"Time taken to compute the log loss: {end - start:.2f} seconds")

Time taken to compute the log loss: 1.59 seconds

By default, the metrics are computed on the test set only. However, if a training set is provided, we can also compute the metrics by specifying the data_source parameter.

report.metrics.log_loss(data_source="train")

0.09700328649217936

In the case where we are interested in computing the metrics on a completely new set of data, we can use the data_source="X_y" parameter. In addition, we need to provide a X and y parameters.

start = time.time()
metric_report = report.metrics.summarize(
    data_source="X_y", X=split_data["X_test"], y=split_data["y_test"]
).frame()
end = time.time()
metric_report

	HistGradientBoostingClassifier
Metric
Precision	0.687101
Recall	0.462199
ROC AUC	0.943399
Brier score	0.035105
Fit time (s)	8.952221
Predict time (s)	1.557864

print(f"Time taken to compute the metrics: {end - start:.2f} seconds")

Time taken to compute the metrics: 5.04 seconds

As in the other case, we rely on the cache to avoid recomputing the predictions. Internally, we compute a hash of the input data to be sure that we can hit the cache in a consistent way.

start = time.time()
metric_report = report.metrics.summarize(
    data_source="X_y", X=split_data["X_test"], y=split_data["y_test"]
).frame()
end = time.time()
metric_report

	HistGradientBoostingClassifier
Metric
Precision	0.687101
Recall	0.462199
ROC AUC	0.943399
Brier score	0.035105
Fit time (s)	8.952221
Predict time (s)	1.557864

print(f"Time taken to compute the metrics: {end - start:.2f} seconds")

Time taken to compute the metrics: 0.19 seconds

Note

In this last example, we rely on computing the hash of the input data. Therefore, there is a trade-off: the computation of the hash is not free and it might be faster to compute the predictions instead.

Be aware that we can also benefit from the caching mechanism with our own custom metrics. Skore only expects that we define our own metric function to take y_true and y_pred as the first two positional arguments. It can take any other arguments. Let’s see an example.

def operational_decision_cost(y_true, y_pred, amount):
    mask_true_positive = (y_true == pos_label) & (y_pred == pos_label)
    mask_true_negative = (y_true == neg_label) & (y_pred == neg_label)
    mask_false_positive = (y_true == neg_label) & (y_pred == pos_label)
    mask_false_negative = (y_true == pos_label) & (y_pred == neg_label)
    fraudulent_refuse = mask_true_positive.sum() * 50
    fraudulent_accept = -amount[mask_false_negative].sum()
    legitimate_refuse = mask_false_positive.sum() * -5
    legitimate_accept = (amount[mask_true_negative] * 0.02).sum()
    return fraudulent_refuse + fraudulent_accept + legitimate_refuse + legitimate_accept

In our use case, we have a operational decision to make that translate the classification outcome into a cost. It translate the confusion matrix into a cost matrix based on some amount linked to each sample in the dataset that are provided to us. Here, we randomly generate some amount as an illustration.

import numpy as np

rng = np.random.default_rng(42)
amount = rng.integers(low=100, high=1000, size=len(split_data["y_test"]))

Let’s make sure that a function called the predict method and cached the result. We compute the accuracy metric to make sure that the predict method is called.

report.metrics.accuracy()

0.9526373028820011

We can now compute the cost of our operational decision.

start = time.time()
cost = report.metrics.custom_metric(
    metric_function=operational_decision_cost, response_method="predict", amount=amount
)
end = time.time()
cost

-131172.03999999998

print(f"Time taken to compute the cost: {end - start:.2f} seconds")

Time taken to compute the cost: 0.01 seconds

Let’s now clean the cache and see if it is faster.

report.clear_cache()

start = time.time()
cost = report.metrics.custom_metric(
    metric_function=operational_decision_cost, response_method="predict", amount=amount
)
end = time.time()
cost

-131172.03999999998

print(f"Time taken to compute the cost: {end - start:.2f} seconds")

Time taken to compute the cost: 1.57 seconds

We observe that caching is working as expected. It is really handy because it means that we can compute some additional metrics without having to recompute the the predictions.

report.metrics.summarize(
    scoring=["precision", "recall", operational_decision_cost],
    scoring_names=["Precision", "Recall", "Operational Decision Cost"],
    scoring_kwargs={"amount": amount, "response_method": "predict"},
).frame()

	HistGradientBoostingClassifier
Metric
Precision	0.687101
Recall	0.462199
Operational Decision Cost	-131172.040000

It could happen that we are interested in providing several custom metrics which does not necessarily share the same parameters. In this more complex case, skore will require us to provide a scorer using the sklearn.metrics.make_scorer() function.

from sklearn.metrics import make_scorer, f1_score

f1_scorer = make_scorer(f1_score, response_method="predict")
operational_decision_cost_scorer = make_scorer(
    operational_decision_cost, response_method="predict", amount=amount
)
report.metrics.summarize(
    scoring=[f1_scorer, operational_decision_cost_scorer],
    scoring_names=["F1 Score", "Operational Decision Cost"],
).frame()

	HistGradientBoostingClassifier
Metric
F1 Score	0.552645
Operational Decision Cost	-131172.040000

Effortless one-liner plotting#

The skore.EstimatorReport class also provides a plotting interface that allows to plot defacto the most common plots. As for the metrics, we only provide the meaningful set of plots for the provided estimator.

report.metrics.help()

╭───────────────────────────── Available metrics methods ──────────────────────────────╮
│ report.metrics                                                                       │
│ ├── .accuracy(...)         (↗︎)     - Compute the accuracy score.                     │
│ ├── .brier_score(...)      (↘︎)     - Compute the Brier score.                        │
│ ├── .confusion_matrix(...)         - Plot the confusion matrix.                      │
│ ├── .log_loss(...)         (↘︎)     - Compute the log loss.                           │
│ ├── .precision(...)        (↗︎)     - Compute the precision score.                    │
│ ├── .precision_recall(...)         - Plot the precision-recall curve.                │
│ ├── .recall(...)           (↗︎)     - Compute the recall score.                       │
│ ├── .roc(...)                      - Plot the ROC curve.                             │
│ ├── .roc_auc(...)          (↗︎)     - Compute the ROC AUC score.                      │
│ ├── .timings(...)                  - Get all measured processing times related to    │
│ │   the estimator.                                                                   │
│ ├── .custom_metric(...)            - Compute a custom metric.                        │
│ └── .summarize(...)                - Report a set of metrics for our estimator.      │
│                                                                                      │
│                                                                                      │
│ Legend:                                                                              │
│ (↗︎) higher is better (↘︎) lower is better                                             │
╰──────────────────────────────────────────────────────────────────────────────────────╯

Let’s start by plotting the ROC curve for our binary classification task.

display = report.metrics.roc()
display.plot()

ROC Curve for HistGradientBoostingClassifier

The plot functionality is built upon the scikit-learn display objects. We return those display (slightly modified to improve the UI) in case we want to tweak some of the plot properties. We can have quick look at the available attributes and methods by calling the help method or simply by printing the display.

display

skore.RocCurveDisplay(...)

display.help()

╭────────────────────────── RocCurveDisplay  ───────────────────────────╮
│ display                                                               │
│ ├──  Attributes                                                       │
│ │   ├── .ax_                                                          │
│ │   ├── .chance_level_                                                │
│ │   ├── .figure_                                                      │
│ │   └── .lines_                                                       │
│ └── Methods                                                           │
│     ├── .frame(...) - Get the data used to create the ROC curve plot. │
│     ├── .plot(...) - Plot visualization.                              │
│     └── .set_style(...) - Set the style parameters for the display.   │
╰───────────────────────────────────────────────────────────────────────╯

display.plot()
_ = display.ax_.set_title("Example of a ROC curve")

Similarly to the metrics, we aggressively use the caching to avoid recomputing the predictions of the model. We also cache the plot display object by detection if the input parameters are the same as the previous call. Let’s demonstrate the kind of performance gain we can get.

start = time.time()
# we already trigger the computation of the predictions in a previous call
display = report.metrics.roc()
display.plot()
end = time.time()

print(f"Time taken to compute the ROC curve: {end - start:.2f} seconds")

Time taken to compute the ROC curve: 0.01 seconds

Now, let’s clean the cache and check if we get a slowdown.

report.clear_cache()

start = time.time()
display = report.metrics.roc()
display.plot()
end = time.time()

print(f"Time taken to compute the ROC curve: {end - start:.2f} seconds")

Time taken to compute the ROC curve: 1.59 seconds

As expected, since we need to recompute the predictions, it takes more time.

Visualizing the confusion matrix#

Another useful visualization for classification tasks is the confusion matrix, which shows the counts of correct and incorrect predictions for each class.

Let’s first start with a basic confusion matrix:

cm_display = report.metrics.confusion_matrix()
cm_display.plot()
plt.show()

We can normalize the confusion matrix to get percentages instead of raw counts. Here we normalize by true labels (rows):

cm_display = report.metrics.confusion_matrix(normalize="true")
cm_display.plot()
plt.show()

More plotting options are available, check out the API on the confusion matrix for more information. We can customize the display labels:

cm_display = report.metrics.confusion_matrix(display_labels=["Disallowed", "Allowed"])
cm_display.plot()
plt.show()

Finally, the confusion matrix can also be exported as a pandas DataFrame for further analysis:

cm_frame = cm_display.frame()
cm_frame

	Disallowed	Allowed
Disallowed	538	626
Allowed	245	16981

	Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name	Dispute_Status_for_Publication	Name_of_Associated_Covered_Device_or_Medical_Supply1	Name_of_Associated_Covered_Drug_or_Biological1	Physician_Specialty
		Dispute_Status_for_Publication	Name_of_Associated_Covered_Device_or_Medical_Supply1	Name_of_Associated_Covered_Drug_or_Biological1	Physician_Specialty
0	ELI LILLY AND COMPANY	No			Allopathic & Osteopathic Physicians\|Pediatrics\|Pediatric Rheumatology
1	ELI LILLY AND COMPANY	No			Allopathic & Osteopathic Physicians\|Internal Medicine\|Nephrology
2	ELI LILLY AND COMPANY	No			Allopathic & Osteopathic Physicians\|Internal Medicine\|Rheumatology
3	ELI LILLY AND COMPANY	No			Allopathic & Osteopathic Physicians\|Internal Medicine\|Endocrinology, Diabetes & Metabolism
4	ELI LILLY AND COMPANY	No		EFFIENT	Allopathic & Osteopathic Physicians\|Pediatrics\|Pediatric Hematology-Oncology

73,553	GlaxoSmithKline, LLC.	No		ZIAGEN
73,554	ALERE SCARBOROUGH, INC.	No	Alere PBP2a
73,555	NovoCure Limited	No
73,556	Wright Medical Technology, Inc.	No		HIPS
73,557	Alcon Research Ltd	No		Express

Column	Column name	dtype	Is sorted	Null values	Unique values
0	Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name	ObjectDType	False	0 (0.0%)	1466 (2.0%)
1	Dispute_Status_for_Publication	ObjectDType	False	0 (0.0%)	2 (< 0.1%)
2	Name_of_Associated_Covered_Device_or_Medical_Supply1	ObjectDType	False	43088 (58.6%)	4372 (5.9%)
3	Name_of_Associated_Covered_Drug_or_Biological1	ObjectDType	False	36233 (49.3%)	2262 (3.1%)
4	Physician_Specialty	ObjectDType	False	3996 (5.4%)	513 (0.7%)

Column 1	Column 2	Cramér's V
Name_of_Associated_Covered_Device_or_Medical_Supply1	Name_of_Associated_Covered_Drug_or_Biological1	0.385
Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name	Name_of_Associated_Covered_Drug_or_Biological1	0.205
Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name	Name_of_Associated_Covered_Device_or_Medical_Supply1	0.169
Dispute_Status_for_Publication	Name_of_Associated_Covered_Device_or_Medical_Supply1	0.0981
Name_of_Associated_Covered_Device_or_Medical_Supply1	Physician_Specialty	0.0929
Dispute_Status_for_Publication	Physician_Specialty	0.0829
Name_of_Associated_Covered_Drug_or_Biological1	Physician_Specialty	0.0752
Dispute_Status_for_Publication	Name_of_Associated_Covered_Drug_or_Biological1	0.0713
Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name	Physician_Specialty	0.0564
Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name	Dispute_Status_for_Publication	0.0454

`EstimatorReport`: Get insights from any scikit-learn estimator#

Loading our dataset and defining our estimator#

Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name

Dispute_Status_for_Publication

Name_of_Associated_Covered_Device_or_Medical_Supply1

Name_of_Associated_Covered_Drug_or_Biological1

Physician_Specialty

Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name

Dispute_Status_for_Publication

Name_of_Associated_Covered_Device_or_Medical_Supply1

Name_of_Associated_Covered_Drug_or_Biological1

Physician_Specialty

Please enable javascript

status

status

Please enable javascript

Getting insights from our estimator#

Introducing the `skore.EstimatorReport` class#

Metrics computation with aggressive caching#

Effortless one-liner plotting#

Visualizing the confusion matrix#

This Page

	status
	status
0	disallowed
1	disallowed
2	disallowed
3	disallowed
4	disallowed

73,553	allowed
73,554	allowed
73,555	allowed
73,556	allowed
73,557	allowed

EstimatorReport: Get insights from any scikit-learn estimator#

Loading our dataset and defining our estimator#

Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name

Dispute_Status_for_Publication

Name_of_Associated_Covered_Device_or_Medical_Supply1

Name_of_Associated_Covered_Drug_or_Biological1

Physician_Specialty

Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name

Dispute_Status_for_Publication

Name_of_Associated_Covered_Device_or_Medical_Supply1

Name_of_Associated_Covered_Drug_or_Biological1

Physician_Specialty

Please enable javascript

status

status

Please enable javascript

Getting insights from our estimator#

Introducing the skore.EstimatorReport class#

Metrics computation with aggressive caching#

Effortless one-liner plotting#

Visualizing the confusion matrix#

This Page

`EstimatorReport`: Get insights from any scikit-learn estimator#

Introducing the `skore.EstimatorReport` class#