On the tractability of SHAP explanations under Markovian distributions

Reda Marzouk, Colin de La Higuera
2024-05-05
Abstract: Thanks to its solid theoretical foundation, the SHAP framework is arguably one of the most widely utilized frameworks for local explainability of ML models. Despite its popularity, its exact computation is known to be very challenging, proven to be NP-Hard in various configurations. Recent works have unveiled positive complexity results regarding the computation of the SHAP score for specific model families, encompassing decision trees, random forests, and some classes of boolean circuits. Yet, all these positive results hinge on the assumption of feature independence, often simplistic in real-world scenarios. In this article, we investigate the computational complexity of the SHAP score by relaxing this assumption and introducing a Markovian perspective. We show that, under the Markovian assumption, computing the SHAP score for the classes of Weighted automata, Disjoint DNFs and Decision Trees can be performed in polynomial time, offering a first positive complexity result for the problem of SHAP score computation that transcends the limitations of the feature independence assumption.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper studies the computational complexity of calculating SHAP values (SHapley Additive exPlanations) when the background data distribution is Markovian. Specifically, it shows that under this assumption, SHAP values for Weighted Automata (WAs), Disjoint DNFs, and Decision Trees can be computed in polynomial time.

#### Main Contributions:

1. **SHAP Value Computation for Weighted Automata**: The paper proves that, when the background data-generating distribution is Markovian, SHAP values for the class of Weighted Automata can be computed in polynomial time (Section 3).
2. **SHAP Value Computation for Disjoint DNFs and Decision Trees**: Under the same assumption, the paper proves that SHAP values for the classes of Disjoint DNFs and Decision Trees can also be computed in polynomial time (Section 4).

These results go beyond the existing positive results in the literature, all of which rely on the feature independence assumption, and offer a new perspective on the computational complexity of SHAP values in practical applications.
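
To make the quantity being discussed concrete, here is a minimal brute-force sketch of a SHAP score under a Markovian background distribution. It is not the paper's polynomial-time algorithm: it enumerates all feature subsets and all inputs, so it runs in exponential time and only illustrates the definition. It assumes binary features, a first-order homogeneous Markov chain, and the usual conditional-expectation value function v(S) = E[f(Y) | Y_S = x_S]. All names (`markov_prob`, `cond_expectation`, `shap_scores`), the toy disjoint DNF `f`, and the chain parameters are hypothetical choices made for illustration.

```python
from itertools import combinations, product
from math import factorial


def markov_prob(y, init, trans):
    """Probability of the binary sequence y under a first-order Markov chain."""
    p = init[y[0]]
    for prev, cur in zip(y, y[1:]):
        p *= trans[prev][cur]
    return p


def cond_expectation(f, x, S, init, trans):
    """E[f(Y) | Y_S = x_S], computed by enumerating all 2^n sequences."""
    num = den = 0.0
    for y in product((0, 1), repeat=len(x)):
        if all(y[j] == x[j] for j in S):
            p = markov_prob(y, init, trans)
            num += p * f(y)
            den += p
    return num / den  # den > 0 because all chain parameters are positive


def shap_scores(f, x, init, trans):
    """Exact Shapley values of each feature of x (exponential time)."""
    n = len(x)
    scores = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in combinations(others, k):
                phi += w * (cond_expectation(f, x, S + (i,), init, trans)
                            - cond_expectation(f, x, S, init, trans))
        scores.append(phi)
    return scores


def f(y):
    """Toy disjoint DNF (illustrative): (y0 AND y1) OR (NOT y0 AND y2)."""
    return float((y[0] and y[1]) or ((not y[0]) and y[2]))


# Illustrative chain parameters: initial distribution and transition matrix.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]
x = (1, 1, 0)

phi = shap_scores(f, x, init, trans)
print(phi)
```

A useful sanity check is the efficiency property of Shapley values: the scores should sum to f(x) minus the unconditional expectation of f under the chain, which here is `cond_expectation(f, x, (), init, trans)`.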