impurity_decrease

ComparisonReport.inspection.impurity_decrease()

Retrieve the Mean Decrease in Impurity (MDI) for each report.

This method is available only for estimators that expose a feature_importances_ attribute. See, for example, sklearn.ensemble.GradientBoostingClassifier.feature_importances_.
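
As a minimal sketch of this requirement, the snippet below uses plain scikit-learn (not skore) to check whether a fitted estimator exposes feature_importances_. The helper name exposes_mdi is hypothetical and is not part of skore's API; how skore performs this check internally is not shown here.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression


def exposes_mdi(fitted_estimator):
    """Hypothetical helper: True if the estimator has feature_importances_."""
    return hasattr(fitted_estimator, "feature_importances_")


X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
linear = LogisticRegression(max_iter=1000).fit(X, y)

forest_ok = exposes_mdi(forest)  # tree ensembles expose the attribute
linear_ok = exposes_mdi(linear)  # linear models expose coef_ instead
```

Under this sketch, a report built on a tree ensemble would qualify for impurity_decrease(), while one built on a linear model would not.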

Note that the MDI is computed at fit time, that is, from the training data.
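
The following sketch illustrates this point with scikit-learn alone, assuming a random forest on the iris dataset: the importances are available as soon as fit returns, and calling predict on held-out data leaves them unchanged. Variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=10, random_state=0)
forest.fit(X_train, y_train)  # only the training split is seen here

mdi_after_fit = forest.feature_importances_.copy()
forest.predict(X_test)  # using the test split does not affect the MDI
mdi_after_predict = forest.feature_importances_

same = np.array_equal(mdi_after_fit, mdi_after_predict)
total = mdi_after_fit.sum()  # scikit-learn normalizes importances to sum to 1
```

Because the test split plays no role, the MDI describes how the fitted trees used each feature on the training data and says nothing about generalization.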

Reports that share the same features are grouped under a single key and plotted together. This also holds when only some of the reports share the same features: those reports are still plotted together.
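
The grouping rule can be pictured with a small pure-Python sketch. This is not skore's implementation: the helper group_by_features, the report named "other model", and the assumption that "same features" means identical feature names in the same order are all hypothetical.

```python
from collections import defaultdict


def group_by_features(report_features):
    """Hypothetical helper: group report names by their tuple of feature names."""
    groups = defaultdict(list)
    for name, features in report_features.items():
        # Assumption: two reports match when names and order are identical.
        groups[tuple(features)].append(name)
    return dict(groups)


report_features = {
    "small trees": ["sepal length (cm)", "petal length (cm)"],
    "big trees": ["sepal length (cm)", "petal length (cm)"],
    "other model": ["petal width (cm)"],  # hypothetical third report
}
groups = group_by_features(report_features)
```

Here "small trees" and "big trees" land under the same key and would share a plot, while "other model" sits under its own key.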

Returns:

ImpurityDecreaseDisplay: The impurity decrease display containing the feature importances.

See also

ImpurityDecreaseDisplay: Display class for impurity decrease plots.

Examples

>>> from sklearn.datasets import load_iris
>>> from sklearn.ensemble import RandomForestClassifier
>>> from skore import train_test_split
>>> from skore import ComparisonReport, EstimatorReport
>>> X, y = load_iris(return_X_y=True, as_frame=True)
>>> split_data = train_test_split(X=X, y=y, shuffle=False, as_dict=True)
>>> report_small_trees = EstimatorReport(
...     RandomForestClassifier(max_depth=2, random_state=0), **split_data
... )
>>> report_big_trees = EstimatorReport(
...     RandomForestClassifier(random_state=0), **split_data
... )
>>> report = ComparisonReport({
...     "small trees": report_small_trees,
...     "big trees": report_big_trees,
... })
>>> display = report.inspection.impurity_decrease()
>>> display.frame()
     estimator            feature   importance
0  small trees  sepal length (cm)       0.1...
1  small trees   sepal width (cm)       0.0...
2  small trees  petal length (cm)       0.4...
3  small trees   petal width (cm)       0.4...
4    big trees  sepal length (cm)       0.0...
5    big trees   sepal width (cm)       0.0...
6    big trees  petal length (cm)       0.4...
7    big trees   petal width (cm)       0.4...
>>> display.plot() # shows plot