Abstract: Explaining sophisticated machine-learning based systems is an important issue at the foundations of AI. Recent efforts have shown various methods for providing explanations. These approaches can be broadly divided into two schools: those that provide a local and human interpretable approximation of a machine learning algorithm, and logical approaches that exactly characterise one aspect of the decision. In this paper we focus upon the second school of exact explanations with a rigorous logical foundation. There is an epistemological problem with these exact methods. While they can furnish complete explanations, such explanations may be too complex for humans to understand or even to write down in human readable form. Interpretability requires epistemically accessible explanations, explanations humans can grasp. Yet what is a sufficiently complete epistemically accessible explanation still needs clarification. We do this here in terms of counterfactuals, following [Wachter et al., 2017]. With counterfactual explanations, many of the assumptions needed to provide a complete explanation are left implicit. To do so, counterfactual explanations exploit the properties of a particular data point or sample, and as such are also local as well as partial explanations. We explore how to move from local partial explanations to what we call complete local explanations and then to global ones. But to preserve accessibility we argue for the need for partiality. This partiality makes it possible to hide explicit biases present in the algorithm that may be injurious or unfair. We investigate how easy it is to uncover these biases in providing complete and fair explanations by exploiting the structure of the set of counterfactuals providing a complete local explanation.

What problem does this paper attempt to address?

The problem this paper addresses is how to generate fair and adequate explanations of the predictions made by machine-learning algorithms. Specifically, it asks how explanations can remain effective and comprehensible without hiding or covering up possible biases, so that they are also fair and transparent.

### Analysis of the Core Problems in the Paper

1. **Effectiveness and Comprehensibility of Explanations**:
   - **Effectiveness**: Explanations need to accurately reflect the decision-making process of the machine-learning model and provide logically complete and effective information.
   - **Comprehensibility**: Explanations need to be concise and clear so that humans can easily understand and accept them.
2. **Avoiding Hidden Biases**:
   - Explanations of machine-learning models may hide unfair biases, such as the influence of sensitive factors like gender and race. The method proposed in the paper aims to reveal these potential biases through counterfactual explanations.
3. **Fairness and Adequacy**:
   - **Fairness**: Explanations should not contain unfair biases against specific groups.
   - **Adequacy**: Explanations should resolve the explainee's confusion and help them understand the decision-making process of the model.

### Main Contributions

1. **Definition and Properties of Counterfactual Explanations**:
   - The paper defines counterfactual explanations formally, using counterfactual logic, and analyzes their role in explaining the behavior of machine-learning models.
   - Counterfactual explanations help the explainee understand the model's decision-making process by describing what would happen if certain conditions changed.
2. **Formalization of Fairness and Adequacy**:
   - The paper proposes the concept of "fair and adequate explanations" and formalizes it through counterfactual logic and mathematical methods.
   - Fairness requires that explanations do not hide any unfair biases, and adequacy requires that explanations resolve the explainee's confusion.
3. **Analysis of Algorithm Complexity**:
   - The paper analyzes the complexity of generating fair and adequate explanations and provides corresponding theoretical results.
   - Using a game-theoretic framework, the paper explores the computational complexity of finding fair and adequate explanations in a non-cooperative setting.

### Conclusion

By combining logical theories and mathematical methods, the paper proposes a method for generating fair and adequate explanations, aiming to improve the transparency and fairness of explanations of machine-learning models. The method is intended both to improve the interpretability of the model and to avoid hidden biases, which in turn supports the credibility and acceptability of the model.
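As a rough illustration of the ideas summarized above, the sketch below enumerates counterfactuals for one sample of a toy classifier and checks whether a partial explanation leaves out a counterfactual that involves a protected feature. Everything in it is an assumption made for illustration: the classifier, the feature names, the `PROTECTED` set, and the helper functions `minimal_counterfactuals` and `hides_bias` are hypothetical and are not taken from the paper, which works with formal definitions in counterfactual logic rather than brute-force search.

```python
from itertools import combinations

# Hypothetical toy setup (not from the paper): three boolean features,
# one of which is treated as a protected attribute.
FEATURES = ("high_income", "collateral", "group_a")
PROTECTED = {"group_a"}


def classify(x):
    """Toy loan model: approve on high income plus collateral OR group membership."""
    return bool(x["high_income"] and (x["collateral"] or x["group_a"]))


def minimal_counterfactuals(x, model=classify):
    """Return every inclusion-minimal set of feature flips that changes model(x)."""
    original = model(x)
    found = []
    # Enumerate by increasing size so that smaller flip sets are found first.
    for size in range(1, len(FEATURES) + 1):
        for subset in combinations(FEATURES, size):
            # A superset of an already-found set is not minimal; skip it.
            if any(set(prev) <= set(subset) for prev in found):
                continue
            flipped = dict(x)
            for feature in subset:
                flipped[feature] = not flipped[feature]
            if model(flipped) != original:
                found.append(subset)
    return found


def hides_bias(partial, complete, protected=PROTECTED):
    """True if some counterfactual touches a protected feature but none shown does."""
    present = any(protected & set(cf) for cf in complete)
    shown = any(protected & set(cf) for cf in partial)
    return present and not shown


# A rejected applicant: high income, no collateral, not in group A.
sample = {"high_income": True, "collateral": False, "group_a": False}

# All minimal counterfactuals for this sample (a stand-in for a complete
# local explanation in this toy setting).
complete = minimal_counterfactuals(sample)

# A partial explanation that only reports counterfactuals avoiding protected features.
partial = [cf for cf in complete if not PROTECTED & set(cf)]

print("complete:", complete)
print("partial:", partial)
print("partial hides bias:", hides_bias(partial, complete))
```

In this toy case the partial explanation ("you would have been approved with collateral") is accurate, yet it omits the fact that changing group membership alone would also have flipped the decision. Exhaustive enumeration is exponential in the number of features, so this only works as an illustration on very small feature sets.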

Adequate and fair explanations
