Abstract: There has been a recent resurgence of interest in explainable artificial intelligence (XAI) that aims to reduce the opaqueness of AI-based decision-making systems, allowing humans to scrutinize and trust them. Prior work in this context has focused on the attribution of responsibility for an algorithm's decisions to its inputs wherein responsibility is typically approached as a purely associational concept. In this paper, we propose a principled causality-based approach for explaining black-box decision-making systems that addresses limitations of existing methods in XAI. At the core of our framework lies probabilistic contrastive counterfactuals, a concept that can be traced back to philosophical, cognitive, and social foundations of theories on how humans generate and select explanations. We show how such counterfactuals can quantify the direct and indirect influences of a variable on decisions made by an algorithm, and provide actionable recourse for individuals negatively affected by the algorithm's decision. Unlike prior work, our system, LEWIS: (1) can compute provably effective explanations and recourse at local, global and contextual levels; (2) is designed to work with users with varying levels of background knowledge of the underlying causal model; and (3) makes no assumptions about the internals of an algorithmic system except for the availability of its input-output data. We empirically evaluate LEWIS on three real-world datasets and show that it generates human-understandable explanations that improve upon state-of-the-art approaches in XAI, including the popular LIME and SHAP. Experiments on synthetic data further demonstrate the correctness of LEWIS's explanations and the scalability of its recourse algorithm.

What problem does this paper attempt to address?

This paper addresses the opacity of black-box decision-making algorithms. It proposes a causality-based framework for explainable artificial intelligence (XAI) that uses probabilistic contrastive counterfactuals to explain the decisions of a black-box system. The framework targets two limitations of existing XAI methods: they capture associations rather than causal relationships between inputs and outputs, and they cannot offer actionable recourse.

### Core contributions of the paper

1. **Novel probabilistic explanation scores**: The paper introduces necessity scores and sufficiency scores, which quantify how necessary and how sufficient an attribute is for an algorithmic decision. The scores are defined as contrastive counterfactuals, so they reflect the causal influence of an attribute on the decision rather than a mere association.
2. **Multiple types of explanations**: Based on these scores, the proposed system, LEWIS, generates global, local, and contextual explanations, so users can understand the algorithm's decision logic at different levels. For example, LEWIS can explain how important an attribute is across the entire dataset, or how it affected the decision for a specific individual.
3. **Actionable recourse**: For individuals negatively affected by the algorithm, LEWIS generates minimal-intervention suggestions: changes to actionable attribute values that alter the algorithm's decision with high probability. This recourse gives affected users concrete steps to take, in addition to making the algorithm more transparent.
4. **Theoretical basis and empirical validation**: The paper proves that, under certain conditions, probabilistic contrastive counterfactuals can be estimated or bounded from historical data. It also evaluates LEWIS on real-world and synthetic datasets, reporting explanations that improve on approaches such as LIME and SHAP and a recourse algorithm that scales.

### Specific problem solutions

- **The problem the paper attempts to solve**: The decision-making process of black-box algorithms is opaque and hard to interpret. Existing XAI methods often explain only the association between inputs and outputs and ignore causal relationships, which makes their explanations less accurate and makes actionable recourse difficult to provide. By introducing probabilistic contrastive counterfactuals, the paper aims to generate more accurate and comprehensive explanations and thereby increase the transparency and credibility of the algorithm.
- **The main method of the paper**: The paper proposes a causality-based explanation framework built on probabilistic contrastive counterfactuals. It computes necessity and sufficiency scores to quantify the influence of attributes on algorithmic decisions and uses them to generate different types of explanations. It also develops an optimization algorithm for generating actionable recourse.
- **The innovations of the paper**: The main innovation is bringing causality into XAI through probabilistic contrastive counterfactuals. The paper also shows how to estimate these counterfactuals from historical data when only some background knowledge of the underlying causal model is available, which makes the method more practical to apply.
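The summary above names necessity and sufficiency scores without giving formulas. As a rough illustration only, and not LEWIS's implementation, the sketch below estimates both scores for a single binary-outcome attribute from input-output data, using the classical probability-of-necessity and probability-of-sufficiency formulas from the causal-inference literature. Two assumptions are not stated in the summary: the attribute is unconfounded with the decision, and the decision is monotone in the attribute (switching from `x_alt` to `x` never flips a positive decision to a negative one). The function name `explanation_scores`, the attribute values `"good"` and `"poor"`, and the toy records are hypothetical.

```python
from collections import Counter


def explanation_scores(records, x, x_alt):
    """Estimate (necessity, sufficiency) of attribute value x relative to x_alt.

    records: iterable of (attribute_value, decision) pairs, decision in {0, 1}.
    Assumes no confounding and monotonicity (illustrative assumptions).
    """
    counts = Counter(records)
    n_x = counts[(x, 1)] + counts[(x, 0)]
    n_alt = counts[(x_alt, 1)] + counts[(x_alt, 0)]
    if n_x == 0 or n_alt == 0:
        raise ValueError("both attribute values must appear in the data")

    p_pos_x = counts[(x, 1)] / n_x          # Pr(decision = 1 | X = x)
    p_pos_alt = counts[(x_alt, 1)] / n_alt  # Pr(decision = 1 | X = x_alt)

    # Under monotonicity the gap is non-negative in the population;
    # clip at zero to absorb sampling noise in small datasets.
    gap = max(p_pos_x - p_pos_alt, 0.0)

    # Necessity: among positive decisions with X = x, the probability that
    # the decision would have been negative had X been x_alt.
    necessity = gap / p_pos_x if p_pos_x > 0 else 0.0
    # Sufficiency: among negative decisions with X = x_alt, the probability
    # that the decision would have been positive had X been x.
    sufficiency = gap / (1.0 - p_pos_alt) if p_pos_alt < 1 else 0.0
    return necessity, sufficiency


# Hypothetical input-output data: (credit_history, loan_approved).
records = (
    [("good", 1)] * 9 + [("good", 0)] * 1
    + [("poor", 1)] * 3 + [("poor", 0)] * 7
)
nec, suf = explanation_scores(records, "good", "poor")
print(f"necessity={nec:.3f}, sufficiency={suf:.3f}")
```

Computing the same quantities over the whole dataset, over a subpopulation, or conditioned on one individual's attributes would correspond loosely to the global, contextual, and local levels mentioned above; how LEWIS conditions on context in practice is not specified in this summary.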

Explaining Black-Box Algorithms Using Probabilistic Contrastive Counterfactuals