Abstract: Setting proper evaluation objectives for explainable artificial intelligence (XAI) is vital for making XAI algorithms follow human communication norms, support human reasoning processes, and fulfill human needs for AI explanations. In this position paper, we examine the most pervasive human-grounded concept in XAI evaluation: explanation plausibility. Plausibility measures how reasonable the machine explanation is compared to the human explanation, and it has conventionally been formulated as an important evaluation objective for AI explainability tasks. We argue against this idea and show how optimizing and evaluating XAI for plausibility is sometimes harmful, and always ineffective, in achieving model understandability, transparency, and trustworthiness. Specifically, evaluating XAI algorithms for plausibility regularizes the machine explanation to express exactly the same content as the human explanation, which deviates from the fundamental motivation for humans to explain: expressing similar or alternative reasoning trajectories while conforming to understandable forms or language. Optimizing XAI for plausibility regardless of whether the model's decision is correct also jeopardizes model trustworthiness, because doing so breaks an important assumption in human-human explanation, namely that plausible explanations typically imply correct decisions, and vice versa. Violating this assumption eventually leads to either undertrust or overtrust of AI models. Instead of being the end goal in XAI evaluation, plausibility can serve as an intermediate computational proxy for the human process of interpreting explanations to optimize the utility of XAI. We further highlight the importance of explainability-specific evaluation objectives by differentiating the AI explanation task from the object localization task.

On the Failings of Shapley Values for Explainability

Explainability is NOT a Game

Error Analysis of Shapley Value-Based Model Explanations: An Informative Perspective

Refutation of Shapley Values for XAI -- Additional Evidence

Rational Shapley Values

A Perspective on Explainable Artificial Intelligence Methods: SHAP and LIME

Are Objective Explanatory Evaluation metrics Trustworthy? An Adversarial Analysis

Shapley-based Explainable AI for Clustering Applications in Fault Diagnosis and Prognosis

Explainable AI does not provide the explanations end-users are asking for

Why is plausibility surprisingly problematic as an XAI criterion?

Can Explainable AI Explain Unfairness? A Framework for Evaluating Explainable AI

From SHAP Scores to Feature Importance Scores

Explaining Predictive Uncertainty with Information Theoretic Shapley Values

X Hacking: The Threat of Misguided AutoML

Why do explanations fail? A typology and discussion on failures in XAI

Feature Inference Attack on Shapley Values

Precision of Individual Shapley Value Explanations

The XAI Alignment Problem: Rethinking How Should We Evaluate Human-Centered AI Explainability Techniques

A Human-Grounded Evaluation of SHAP for Alert Processing

Collective eXplainable AI: Explaining Cooperative Strategies and Agent Contribution in Multiagent Reinforcement Learning with Shapley Values

Shapley Lorenz Values for Artificial Intelligence Risk Management