Abstract: Recent studies evaluating various criteria for explainable artificial intelligence (XAI) suggest that fidelity, stability, and comprehensibility are among the most important metrics considered by users of AI across a diverse collection of usage contexts. We consider these criteria as applied to feature-based attribution methods, which are amongst the most prevalent in XAI literature. Going beyond standard correlation, methods have been proposed that highlight what should be minimally sufficient to justify the classification of an input (viz. pertinent positives). While minimal sufficiency is an attractive property akin to comprehensibility, the resulting explanations are often too sparse for a human to understand and evaluate the local behavior of the model. To overcome these limitations, we incorporate the criteria of stability and fidelity and propose a novel method called Path-Sufficient Explanations Method (PSEM) that outputs a sequence of stable and sufficient explanations for a given input of strictly decreasing size (or value) -- from original input to a minimally sufficient explanation -- which can be thought to trace the local boundary of the model in a stable manner, thus providing better intuition about the local model behavior for the specific input. We validate these claims, both qualitatively and quantitatively, with experiments that show the benefit of PSEM across three modalities (image, tabular and text) as well as versus other path explanations. A user study depicts the strength of the method in communicating the local behavior, where (many) users are able to correctly determine the prediction made by a model.

What problem does this paper attempt to address?

This paper asks how to explain the decisions of an artificial intelligence (AI) model with an explanation path that is neither redundant nor oversimplified, while keeping the explanations stable, faithful, and comprehensible. It focuses on feature-based attribution methods, which are very common in the explainable artificial intelligence (XAI) literature. The authors propose a new method, the Path-Sufficient Explanations Method (PSEM), which generates a sequence of stable and sufficient explanations for a given input. The explanations shrink step by step from the original input to a minimally sufficient explanation, so that the local boundary of the model can be traced in a stable manner, giving better intuition about the local model behavior for that specific input.

### Main contributions of the paper

1. **Propose a novel (constrained) formulation to learn a sequence of stable sufficient explanations**: The formulation considers not only the sufficiency of each explanation but also stability and sparsity as optimization goals, so that each step of the explanation path is stable and the path decreases in size.
2. **Propose a method to effectively solve the optimization problem**: A customized alternating minimization algorithm addresses the optimization challenges of the path-sufficient explanation problem.
3. **Quantitatively demonstrate the advantages of path explanations**: Experiments against adapted versions of known post-hoc explanation methods verify the improvements of PSEM across multiple modalities (images, tabular data, and text).
4. **Applications across multiple modalities**: Unlike previous methods that are applied only to specific modalities, PSEM can be applied to three different modalities: images, tabular data, and text.
5. **Demonstrate the value of the method through user studies on standard tasks**: The user study tests not only how well users understand the explanations but also how much they trust them. The results show that PSEM has significant advantages in improving the comprehensibility and trustworthiness of the explanations.

### Specific problems solved

- **Stability of explanations**: Many existing explanation methods produce unstable explanations when the sparsity parameter is adjusted; that is, the feature sets of the explanations fluctuate greatly between sparsity levels. PSEM introduces stability constraints so that each step of the explanation path is stable.
- **Sufficiency of explanations**: Traditional minimally sufficient explanations (pertinent positives, PP) give the smallest feature set that preserves the model's decision, but these explanations are often too sparse for humans to understand and evaluate. By generating a sequence of gradually shrinking explanations, PSEM supplies intermediate steps that make the explanation richer and easier to follow.
- **Comprehensibility of explanations**: The explanation path generated by PSEM is stable and also keeps the model's prediction consistent at each step, which helps users understand the local behavior of the model.

In summary, PSEM addresses the shortcomings of existing explanation methods in stability, sufficiency, and comprehensibility, and gives users a more reliable and intuitive explanation tool.
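The idea of a path of sufficient explanations can be illustrated with a small toy sketch. The code below is not the PSEM algorithm: the paper solves a constrained optimization problem with alternating minimization, whereas this sketch swaps in a simple greedy feature-removal loop. The linear classifier, the zero baseline used for removed features, and the helper names `predict`, `sufficient_path`, and `jaccard` are all assumptions made for illustration. It shows only the two properties the summary describes: every explanation on the path preserves the prediction (sufficiency), and consecutive explanations are nested, so their feature sets overlap heavily (a crude proxy for stability).

```python
import numpy as np


def predict(w, b, x):
    """Toy binary linear classifier (assumed): 1 if w.x + b > 0, else 0."""
    return int(np.dot(w, x) + b > 0)


def sufficient_path(w, b, x):
    """Greedy stand-in for a path of sufficient explanations.

    Starts from the full input and removes one feature per step (setting it
    to an assumed zero baseline), always keeping the prediction unchanged.
    Stops when no single remaining feature can be removed.
    Returns a list of boolean masks of strictly decreasing size.
    """
    target = predict(w, b, x)
    mask = np.ones_like(x, dtype=bool)
    path = [mask.copy()]
    while True:
        best, best_margin = None, -np.inf
        for i in np.flatnonzero(mask):
            trial = mask.copy()
            trial[i] = False
            x_masked = np.where(trial, x, 0.0)
            if predict(w, b, x_masked) != target:
                continue  # removing feature i would break sufficiency
            score = np.dot(w, x_masked) + b
            margin = score if target == 1 else -score
            if margin > best_margin:
                best, best_margin = i, margin
        if best is None:
            break
        mask[best] = False
        path.append(mask.copy())
    return path


def jaccard(a, b):
    """Overlap between two feature masks; 1.0 means identical sets."""
    union = np.logical_or(a, b).sum()
    return 1.0 if union == 0 else np.logical_and(a, b).sum() / union


# Hypothetical weights and input, chosen only to make the path visible.
w = np.array([2.0, -1.0, 0.5, 0.1])
b = -0.5
x = np.array([1.0, 1.0, 1.0, 1.0])

path = sufficient_path(w, b, x)
for prev, cur in zip(path, path[1:]):
    print(cur.astype(int), "overlap with previous:", round(jaccard(prev, cur), 2))
```

Because each mask is obtained by removing a single feature from the previous one, nesting and high overlap hold here by construction. In PSEM, by contrast, stability is stated to be enforced through explicit constraints in the optimization problem.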

When Stability meets Sufficiency: Informative Explanations that do not Overwhelm

Minimalistic Explanations: Capturing the Essence of Decisions

Sufficient and Necessary Explanations (and What Lies in Between)

Selective Explanations

Helpful, Misleading or Confusing: How Humans Perceive Fundamental Building Blocks of Artificial Intelligence Explanations

Adequate and fair explanations

Incremental XAI: Memorable Understanding of AI with Incremental Explanations

Locally-Minimal Probabilistic Explanations

"Explanation" is Not a Technical Term: The Problem of Ambiguity in XAI

Can I Trust the Explainer? Verifying Post-hoc Explanatory Methods

"Is your explanation stable?": A Robustness Evaluation Framework for Feature Attribution

Distance-Restricted Explanations: Theoretical Underpinnings & Efficient Implementation

From Robustness to Explainability and Back Again

Are Objective Explanatory Evaluation metrics Trustworthy? An Adversarial Analysis

Altruist: Argumentative Explanations through Local Interpretations of Predictive Models

Good Explanations in Explainable Artificial Intelligence (XAI): Evidence from Human Explanatory Reasoning

Towards a Unified Framework for Evaluating Explanations

Provably Better Explanations with Optimized Aggregation of Feature Attributions

Multi-objective Feature Attribution Explanation For Explainable Machine Learning

F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI

A Psychological Theory of Explainability