Abstract: Explaining sophisticated machine-learning-based systems is an important issue at the foundations of AI. Recent efforts have shown various methods for providing explanations. These approaches can be broadly divided into two schools: those that provide a local and human-interpretable approximation of a machine learning algorithm, and logical approaches that exactly characterise one aspect of the decision. In this paper we focus upon the second school of exact explanations with a rigorous logical foundation. There is an epistemological problem with these exact methods. While they can furnish complete explanations, such explanations may be too complex for humans to understand or even to write down in human-readable form. Interpretability requires epistemically accessible explanations, that is, explanations humans can grasp. Yet what counts as a sufficiently complete, epistemically accessible explanation still needs clarification. We clarify this here in terms of counterfactuals, following [Wachter et al., 2017]. With counterfactual explanations, many of the assumptions needed to provide a complete explanation are left implicit. To do so, counterfactual explanations exploit the properties of a particular data point or sample, and as such are local as well as partial explanations. We explore how to move from local partial explanations to what we call complete local explanations and then to global ones. But to preserve accessibility we argue for the need for partiality. This partiality makes it possible to hide explicit biases present in the algorithm that may be injurious or unfair. We investigate how easy it is to uncover these biases in providing complete and fair explanations by exploiting the structure of the set of counterfactuals providing a complete local explanation.

Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review

Counterfactual Explanations for Machine Learning: Challenges Revisited

Counterfactual explanations and how to find them: literature review and benchmarking

On the computation of counterfactual explanations -- A survey

Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations

Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis

Counterfactual Explanations via Locally-guided Sequential Algorithmic Recourse

The Use and Misuse of Counterfactuals in Ethical Machine Learning

Redefining Counterfactual Explanations for Reinforcement Learning: Overview, Challenges and Opportunities

Adequate and fair explanations

Robust Counterfactual Explanations in Machine Learning: A Survey

Disagreement amongst counterfactual explanations: How transparency can be deceptive

Explaining Explanations: An Overview of Interpretability of Machine Learning

A Survey of the Various Methodologies Towards making Artificial Intelligence More Explainable

A Series of Unfortunate Counterfactual Events: the Role of Time in Counterfactual Explanations

A Survey of Contrastive and Counterfactual Explanation Generation Methods for Explainable Artificial Intelligence

Machine Explanations and Human Understanding

Features of Explainability: How users understand counterfactual and causal explanations for categorical and continuous features in XAI

Counterfactual Explanations for Support Vector Machine Models

Explain To Decide: A Human-Centric Review on the Role of Explainable Artificial Intelligence in AI-assisted Decision Making