Example: Feature importance and SHAP values¶
Motivation¶
As stated in the example on LIME explainability, LIME is a local explainability method. This means that LIME focuses on one sample -- or its locality/vicinity/neighborhood -- at a time and tries to imitate the true model behaviour around that sample with a simpler model.
On the other hand, SHAP -- short for SHapley Additive exPlanations -- is another popular method for interpretability which can provide both local and global interpretations. It uses a cooperative game theory approach to estimate the impact of each feature on the prediction. Local interpretation in SHAP refers to the explanation for a specific prediction made for an individual instance in your dataset. In addition, global interpretation can be achieved in a holistic way by analyzing the importance of each feature across the entire dataset.
Lastly, feature importance computation is another widely-used method for global interpretability. While it may not be applicable to all types of models, most tree-based models—including decision trees, random forests, and gradient boosting machines—come with built-in mechanisms for calculating feature importance. This approach ranks the features based on their contributions to the model, providing an overarching view of what features predominantly drive the model's decision-making process.
While SHAP and feature importance are typically used in supervised learning scenarios, the key
motivation of better understanding a model's behaviour applies just as well to CATE
estimation. However, in the context of MetaLearners when compared to the traditional supervised setting,
there's a notable challenge. Some MetaLearners, such as the SLearner,
TLearner and XLearner, do not possess a direct
CATE prediction model. Conversely, others like the RLearner and
DRLearner have a dedicated second stage model which directly estimates the CATE.
Given these varied scenarios, we provide examples for both situations in the subsequent sections.
Therefore, we illustrate how it can be used with the MetaLearners from metalearners.
Background¶
Feature importance¶
Most tree-based estimators compute feature importances while fitting, see
here and
the importance_type attribute from LGBMRegressor
or XGBRegressor.
These feature importance measures present valuable insights into the impact of each feature on the CATE estimation.
It is important to note that there are different methods of computing the feature importances,
for example LGBMRegressor
allows to pass importance_type="split" or importance_type="gain". In the former
feature importances are computed as the number of times the feature is used to split a tree
and in the latter the total gain of the splits which use the feature is used. This can affect
the results and depending on the use case one may be better than the other.
SHAP values¶
SHAP is a method developed by Lundberg et al. (2017). SHAP values base their approach on the Shapley value, a concept derived from cooperative game theory. This concept assigns a payout to each player contributing to a cooperative game, depending on their contribution to the total payout. Translated into the realm of machine learning, a game is the prediction task, players are the features, and the payout becomes the prediction.
A positive factor of the SHAP values is their desirable properties which come from their theoretical foundation in cooperative game theory. These properties include:
- Local accuracy: The sum of all Shapley values always adds up to the model's prediction.
- Missingness: If a feature is set to zero, its Shapley value also becomes zero.
- Consistency: If a model changes so that the marginal contribution of a feature value increases or stays the same (regardless of other features), its Shapley value doesn't decrease.
Much like the Shapley values that inspired them, computing SHAP values has a high computational cost. Each possible subset of features needs to be considered to calculate exact SHAP values, which leads to an exponential increase in computation with the number of features. However, techniques like Kernel SHAP and Tree SHAP have been developed, which can approximate SHAP values efficiently even over thousands of features.
Usage¶
Loading the data¶
Just like in our example on estimating CATEs with a MetaLearner, we will first load some experiment data:
import pandas as pd
from pathlib import Path
from git_root import git_root
df = pd.read_csv(git_root("data/learning_mindset.zip"))
outcome_column = "achievement_score"
treatment_column = "intervention"
feature_columns = [
column
for column in df.columns
if column not in [outcome_column, treatment_column]
]
categorical_feature_columns = [
"ethnicity",
"gender",
"frst_in_family",
"school_urbanicity",
"schoolid",
]
# Note that explicitly setting the dtype of these features to category
# allows both lightgbm as well as shap plots to
# 1. Operate on features which are not of type int, bool or float
# 2. Correctly interpret categoricals with int values to be
# interpreted as categoricals, as compared to ordinals/numericals.
for categorical_feature_column in categorical_feature_columns:
df[categorical_feature_column] = df[categorical_feature_column].astype(
"category"
)
Now that we've loaded the experiment data, we can train a couple of MetaLearners.
Training MetaLearners¶
Again, mirroring our example on estimating CATEs with a MetaLearner,
we can train an RLearner as follows:
from metalearners import RLearner
from lightgbm import LGBMRegressor, LGBMClassifier
rlearner = RLearner(
nuisance_model_factory=LGBMRegressor,
propensity_model_factory=LGBMClassifier,
treatment_model_factory=LGBMRegressor,
nuisance_model_params={"verbose": -1},
propensity_model_params={"verbose": -1},
treatment_model_params={"verbose": -1},
is_classification=False,
n_variants=2,
)
rlearner.fit(
X=df[feature_columns],
y=df[outcome_column],
w=df[treatment_column],
)
<metalearners.rlearner.RLearner at 0x124525890>
We will also train a TLearner to show both scenarios:
from metalearners import TLearner
tlearner = TLearner(
nuisance_model_factory=LGBMRegressor,
is_classification=False,
n_variants=2,
nuisance_model_params={"verbose": -1},
)
tlearner.fit(
X=df[feature_columns],
y=df[outcome_column],
w=df[treatment_column],
)
<metalearners.tlearner.TLearner at 0x30f632750>
Creating an Explainer object¶
Now we can need to create an Explainer object which will
allow us to compute the feature importance and the SHAP values.
This step is the key difference between the models which have a unique treatment model for each variant and the ones that do not.
The MetaLearner has a unique treatment model for each variant¶
In our trained RLearner the treatment model is a LGBMRegressor, this computes feature
importances and can be used for SHAP values calculations, therefore we can directly use
it as the model to interpret. We can get an Explainer
which uses the final stage models with the following code:
rlearner_explainer = rlearner.explainer()
The MetaLearner does not have a unique treatment model for each variant¶
In the scenario where the MetaLearner does not have a unique treatment model for each
variant, the proposed solution is to fit an interpretable model for each treatment variant.
The input for these models will be the same covariates, while the output is the estimated CATE
by the MetaLearner. For this we can use the same method as before MetaLearner.explainer but passing also the X,
cate_estimates and cate_model_factory parameters:
tlearner_explainer = tlearner.explainer(
X=df[feature_columns],
cate_estimates=tlearner.predict(X=df[feature_columns], is_oos=False),
cate_model_factory=LGBMRegressor,
cate_model_params={"verbose": -1},
)
It is important to note that this method works for every MetaLearner as we can always retrain
new models to interpret, for example with the previously trained RLearner:
rlearner_explainer_from_estimates = rlearner.explainer(
X=df[feature_columns],
cate_estimates=rlearner.predict(X=df[feature_columns], is_oos=False),
cate_model_factory=LGBMRegressor,
cate_model_params={"verbose": -1},
)
Computing the feature importance¶
Now we can compute the feature importances:
print(rlearner_explainer.feature_importances(feature_names=feature_columns, sort_values=True)[0], "\n")
print(tlearner_explainer.feature_importances(feature_names=feature_columns, sort_values=True)[0], "\n")
print(rlearner_explainer_from_estimates.feature_importances(feature_names=feature_columns, sort_values=True)[0], "\n")
success_expect 600
schoolid 494
ethnicity 311
school_size 254
gender 246
school_ethnic_minority 226
school_poverty 224
school_mindset 221
school_achievement 211
frst_in_family 210
school_urbanicity 3
dtype: int32
schoolid 846
success_expect 481
ethnicity 409
school_ethnic_minority 212
school_mindset 201
gender 199
frst_in_family 191
school_size 163
school_achievement 148
school_poverty 138
school_urbanicity 12
dtype: int32
schoolid 858
ethnicity 495
success_expect 427
school_mindset 214
gender 191
school_ethnic_minority 175
frst_in_family 169
school_poverty 161
school_achievement 153
school_size 152
school_urbanicity 5
dtype: int32
Note that the method feature_importances
returns a list of length $n_{variats} - 1$ that indicates the feature importance for
each variant against control. Remember that a higher value means that the corresponding
feature is more important for the CATE prediction.
Computing and plotting the SHAP values¶
We can compute the SHAP values with the corresponding calls to shap_values with the desired shap_explainer_factory, in our case as we are are always interpreting
LGBMRegressor models we will use TreeExplainer:
from shap import TreeExplainer, summary_plot
shap_values_rlearner = rlearner_explainer.shap_values(
X=df[feature_columns], shap_explainer_factory=TreeExplainer
)
summary_plot(shap_values_rlearner[0], features=df[feature_columns])
shap_values_tlearner = tlearner_explainer.shap_values(
X=df[feature_columns], shap_explainer_factory=TreeExplainer
)
summary_plot(shap_values_tlearner[0], features=df[feature_columns])
shap_values_rlearner_from_estimates = rlearner_explainer_from_estimates.shap_values(
X=df[feature_columns], shap_explainer_factory=TreeExplainer
)
summary_plot(shap_values_rlearner_from_estimates[0], features=df[feature_columns])
In these SHAP summary plots, the color and orientation of the plotted values help us to understand their impact on model predictions.
Each dot in the plot represents a single instance of the given feature present in the data set. The x-axis displays the Shapley value, signifying the strength and directionality of the feature's impact. The y-axis displays a subset of the features in the model.
The Shapley value, exhibited on the horizontal axis, is oriented such that values on the right of the center line (0 mark) contribute to a positive shift in the predicted outcome, while those on the left indicate a negative impact.
The color coding implemented in these plots is straightforward: red implies a high feature value, while blue denotes a low feature value. This color scheme assists in identifying whether high or low values of a certain feature influence the model's output positively or negatively. The categorical variables are colored in grey.
For more guidelines on how to interpret such SHAP plots please see the SHAP documentation.
Note that the method shap_values returns a list of length $n_{variats} - 1$ that indicates the SHAP values for
each variant against control.
Further comments¶
- In the scenario where the model has a unique treatment model for each variant and the used
base model does not compute
feature_importances_, theExplainerobject will raise an error when callingfeature_importances. In that case, it is required to retrain a model which computesfeature_importances_by passingX,cate_estimatesand a compatiblecate_model_factorytoMetaLearner.explainer.