On the tractability of SHAP explanations under Markovian distributions

Reda Marzouk, Colin de La Higuera

2024-05-05

Abstract: Thanks to its solid theoretical foundation, the SHAP framework is arguably one of the most widely utilized frameworks for local explainability of ML models. Despite its popularity, its exact computation is known to be very challenging, proven to be NP-Hard in various configurations. Recent works have unveiled positive complexity results regarding the computation of the SHAP score for specific model families, encompassing decision trees, random forests, and some classes of boolean circuits. Yet, all these positive results hinge on the assumption of feature independence, often simplistic in real-world scenarios. In this article, we investigate the computational complexity of the SHAP score by relaxing this assumption and introducing a Markovian perspective. We show that, under the Markovian assumption, computing the SHAP score for the class of Weighted automata, Disjoint DNFs and Decision Trees can be performed in polynomial time, offering a first positive complexity result for the problem of SHAP score computation that transcends the limitations of the feature independence assumption.

Machine Learning, Artificial Intelligence

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper studies the computational complexity of calculating SHAP (SHapley Additive exPlanations) values when the background data distribution is Markovian. Specifically, it shows that, under this assumption, SHAP values for Weighted Automata (WAs), Disjoint DNFs, and Decision Trees can be computed in polynomial time.

#### Main Contributions:

1. **SHAP Value Computation for Weighted Automata**: The paper proves that, when the background data-generating distribution is Markovian, computing SHAP values for the class of Weighted Automata is tractable, i.e. solvable in polynomial time (Section 3).
2. **SHAP Value Computation for Disjoint DNFs and Decision Trees**: Under the same assumption, the paper proves that computing SHAP values for the classes of Disjoint DNFs and Decision Trees is also tractable (Section 4).

These results go beyond the existing literature, whose positive results rely on the feature independence assumption, and provide a new perspective on the computational complexity of SHAP values in practical applications.
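To make the object of study concrete, the following is a minimal brute-force sketch of the SHAP score when the background distribution is a first-order Markov chain over binary features. It is not the polynomial-time algorithm of the paper: it enumerates all feature subsets and all completions, so it is exponential in the number of features. It assumes the standard Shapley-weighted definition with value function E[f(X) | X_S = x_S], a homogeneous chain, and strictly positive probabilities. All names (`markov_prob`, `cond_expectation`, `shap_score`, `init`, `trans`) and the toy model `f` are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product
from math import factorial


def markov_prob(z, init, trans):
    """Probability of the binary sequence z under a first-order Markov chain.

    init[a] is P(X_1 = a); trans[a][b] is P(X_{t+1} = b | X_t = a).
    """
    p = init[z[0]]
    for a, b in zip(z, z[1:]):
        p *= trans[a][b]
    return p


def cond_expectation(f, x, S, init, trans):
    """E[f(X) | X_S = x_S], by enumerating every sequence that agrees with x on S.

    Assumes the conditioning event has non-zero probability.
    """
    num = den = 0.0
    for z in product((0, 1), repeat=len(x)):
        if all(z[j] == x[j] for j in S):
            p = markov_prob(z, init, trans)
            num += p * f(z)
            den += p
    return num / den


def shap_score(f, x, i, init, trans):
    """Shapley-weighted sum of marginal contributions of feature i at instance x."""
    n = len(x)
    others = [j for j in range(n) if j != i]
    total = 0.0
    for k in range(n):
        weight = factorial(k) * factorial(n - k - 1) / factorial(n)
        for S in combinations(others, k):
            with_i = cond_expectation(f, x, S + (i,), init, trans)
            without_i = cond_expectation(f, x, S, init, trans)
            total += weight * (with_i - without_i)
    return total


# Toy setup (illustrative values only): a 3-feature chain and a small
# decision-tree-like Boolean model.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]
f = lambda z: 1.0 if (z[0] == 1 and z[2] == 1) or z[1] == 1 else 0.0
x = (1, 0, 1)
scores = [shap_score(f, x, i, init, trans) for i in range(len(x))]
```

A useful sanity check on any such implementation is the efficiency property of Shapley values: the scores should sum to `f(x)` minus the unconditional expectation `cond_expectation(f, x, (), init, trans)`.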
