Variational Shapley Network: A Probabilistic Approach to Self-Explaining Shapley values with Uncertainty Quantification

Mert Ketenci, Iñigo Urteaga, Victor Alfonso Rodriguez, Noémie Elhadad, Adler Perotte
2024-02-07
Abstract: Shapley values have emerged as a foundational tool in machine learning (ML) for elucidating model decision-making processes. Despite their widespread adoption and unique ability to satisfy essential explainability axioms, computational challenges persist in their estimation when ($i$) evaluating a model over all possible subsets of input features, ($ii$) estimating model marginals, and ($iii$) addressing variability in explanations. We introduce a novel, self-explaining method that simplifies the computation of Shapley values significantly, requiring only a single forward pass. Recognizing the deterministic treatment of Shapley values as a limitation, we explore incorporating a probabilistic framework to capture the inherent uncertainty in explanations. Unlike alternatives, our technique does not rely directly on the observed data space to estimate marginals; instead, it uses adaptable baseline values derived from a latent, feature-specific embedding space, generated by a novel masked neural network architecture. Evaluations on simulated and real datasets underscore our technique's robust predictive and explanatory performance.
Machine Learning
What problem does this paper attempt to address?
This paper attempts to address three main challenges that arise when Shapley values are used to interpret machine-learning models:

1. **Computing all possible feature subset combinations**: Computing Shapley values requires evaluating the model on every subset of the input features. The number of subsets grows exponentially with the number of features, which is a heavy computational burden for high-dimensional data.
2. **Estimating the model marginal distribution**: Shapley values are defined in terms of the model's marginal for each feature subset, and in practice these marginals are often difficult to estimate accurately.
3. **Uncertainty in interpretation**: Traditional Shapley values are usually treated as deterministic, ignoring the inherent uncertainty in the model output. This can lead to inaccurate or unreliable interpretations of the model's decisions.

To address these challenges, the paper proposes a new method, the Variational Shapley Network (VSN). The main contributions of VSN include:

1. **Modeling Shapley values as random variables**: The paper defines Stochastic Shapley Values (SSVs) and represents them with a conditional probability distribution \( \phi_j \sim p(\phi_j \mid x) \), which allows the uncertainty in Shapley values to be captured.
2. **Self-explaining SSV inference**: The paper proposes a variational objective for learning the generative parameters of the SSVs. This objective guides the prior distribution of the SSVs to concentrate around the Expected Shapley Value (ESV) of each feature.
3. **Interpreting the uncertainty in the model output**: The paper introduces a conditional likelihood function that captures the uncertainty in the model output and yields an uncertainty estimate for the SSVs.
4. **Efficiently handling variable-length inputs**: The paper proposes a masked neural network architecture that can efficiently process and marginalize variable-length continuous data. In addition, baseline values in the learned embedding space are used in place of baseline values in the input space to improve the accuracy of marginalization.
5. **Accurate Shapley value estimation and uncertainty quantification**: The paper shows on synthetic and real-world datasets that the method can learn centered SSV distributions and provide useful uncertainty quantification of the random effects observed in the data.

Through these contributions, VSN not only simplifies the computation of Shapley values but also provides a more comprehensive way to interpret model decisions while accounting for the uncertainty in the model output.
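To make the first two challenges concrete, the following is a minimal sketch of the classical exact Shapley computation, not of VSN itself. It enumerates every subset S of the other features and weights each marginal contribution by |S|! (d - |S| - 1)! / d!, so the cost grows as 2^d model evaluations per feature. The names `exact_shapley` and `toy_model`, the zero baseline, and the input values are illustrative assumptions that do not appear in the paper. Replacing absent features with a fixed baseline is only a crude stand-in for proper marginalization.

```python
from itertools import combinations
from math import factorial


def exact_shapley(f, x, baseline):
    """Exact Shapley values of f at x by brute-force subset enumeration.

    Absent features are replaced by `baseline` values (an assumption made
    here for simplicity; it approximates marginalizing them out).
    """
    d = len(x)

    def value(subset):
        # Features in `subset` keep their observed value, the rest use the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(d)]
        return f(z)

    phi = [0.0] * d
    for j in range(d):
        others = [i for i in range(d) if i != j]
        for size in range(d):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature j when added to coalition s.
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model: one additive term plus one pairwise interaction.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)
print(phi)
```

For this toy model the attributions come out to approximately 2, 3, and 3: the additive term is credited entirely to the first feature, and the interaction term is split evenly between the other two. Their sum equals `toy_model(x) - toy_model(baseline)`, which is the efficiency axiom. The result is a single deterministic vector, which is the limitation the summary describes; a method that treats Shapley values as random variables would return a distribution per feature instead.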