Adequate and fair explanations

Nicholas Asher, Soumya Paul, Chris Russell
DOI: https://doi.org/10.1007/978-3-030-84060-0
2021-08-21
Abstract: Explaining sophisticated machine-learning based systems is an important issue at the foundations of AI. Recent efforts have shown various methods for providing explanations. These approaches can be broadly divided into two schools: those that provide a local and human interpretable approximation of a machine learning algorithm, and logical approaches that exactly characterise one aspect of the decision. In this paper we focus upon the second school of exact explanations with a rigorous logical foundation. There is an epistemological problem with these exact methods. While they can furnish complete explanations, such explanations may be too complex for humans to understand or even to write down in human readable form. Interpretability requires epistemically accessible explanations, explanations humans can grasp. Yet what is a sufficiently complete epistemically accessible explanation still needs clarification. We do this here in terms of counterfactuals, following [Wachter et al., 2017]. With counterfactual explanations, many of the assumptions needed to provide a complete explanation are left implicit. To do so, counterfactual explanations exploit the properties of a particular data point or sample, and as such are also local as well as partial explanations. We explore how to move from local partial explanations to what we call complete local explanations and then to global ones. But to preserve accessibility we argue for the need for partiality. This partiality makes it possible to hide explicit biases present in the algorithm that may be injurious or unfair. We investigate how easy it is to uncover these biases in providing complete and fair explanations by exploiting the structure of the set of counterfactuals providing a complete local explanation.
Artificial Intelligence
What problem does this paper attempt to address?
The paper asks how to generate fair and adequate explanations of the predictions made by machine-learning algorithms. Specifically, it asks how an explanation can stay valid and comprehensible without hiding or masking biases that may be present in the model.

### Analysis of the Core Problems in the Paper

1. **Effectiveness and Comprehensibility of Explanations**:
   - **Effectiveness**: An explanation must accurately reflect the decision-making process of the machine-learning model and provide logically complete information.
   - **Comprehensibility**: An explanation must be concise and clear enough for humans to understand and accept.
2. **Avoiding Hidden Biases**:
   - Explanations of machine-learning models may hide unfair biases, such as the influence of sensitive attributes like gender and race. The paper aims to reveal these potential biases through counterfactual explanations.
3. **Fairness and Adequacy**:
   - **Fairness**: An explanation should not hide unfair biases against specific groups.
   - **Adequacy**: An explanation should resolve the explainee's confusion and help them understand the model's decision-making process.

### Main Contributions

1. **Definition and Properties of Counterfactual Explanations**:
   - The paper defines counterfactual explanations formally, using counterfactual logic, and analyzes their role in explaining the behavior of machine-learning models.
   - A counterfactual explanation helps the explainee understand a decision by stating what the outcome would have been had certain conditions been different.
2. **Formalization of Fairness and Adequacy**:
   - The paper proposes the concept of "fair and adequate explanations" and formalizes it using counterfactual logic.
   - Fairness requires that an explanation does not hide any unfair bias; adequacy requires that it resolves the explainee's confusion.
3. **Analysis of Algorithm Complexity**:
   - The paper analyzes the complexity of generating fair and adequate explanations and provides corresponding theoretical results.
   - Using a game-theoretic framework, the paper explores the computational complexity of finding fair and adequate explanations in a non-cooperative setting.

### Conclusion

By combining logical and mathematical methods, the paper proposes an approach to generating fair and adequate explanations, with the aim of improving the transparency and fairness of explanations of machine-learning models. The approach is intended both to improve interpretability and to make hidden biases harder to conceal, which in turn supports the credibility and acceptability of the model.
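
To make the idea of a partial counterfactual explanation hiding a bias concrete, here is a minimal Python sketch. It is not the paper's formalization: the toy classifier, the feature names (`income_high`, `has_debt`, `gender`), and the helper functions are all hypothetical, and the brute-force enumeration is only workable for a handful of binary features. The sketch enumerates every subset-minimal set of feature flips that changes the prediction for one sample, then checks whether a partial explanation omits a counterfactual that touches a sensitive feature.

```python
from itertools import combinations

# Hypothetical binary features of a toy loan model (illustration only).
FEATURES = ("income_high", "has_debt", "gender")
SENSITIVE = {"gender"}


def classify(x):
    """Toy, deliberately biased rule: approve on high income without debt,
    or on high income whenever gender == 1."""
    return bool(x["income_high"] and (not x["has_debt"] or x["gender"] == 1))


def flip(x, names):
    """Return a copy of x with the named binary features flipped."""
    y = dict(x)
    for name in names:
        y[name] = 1 - y[name]
    return y


def minimal_counterfactuals(x):
    """All subset-minimal sets of feature flips that change the prediction."""
    base = classify(x)
    found = []
    # Visit candidate sets by increasing size so smaller ones are found first.
    for k in range(1, len(FEATURES) + 1):
        for names in combinations(FEATURES, k):
            candidate = set(names)
            # A superset of a known counterfactual is not minimal.
            if any(prev <= candidate for prev in found):
                continue
            if classify(flip(x, names)) != base:
                found.append(candidate)
    return found


def touches_sensitive(counterfactuals):
    """True if any counterfactual flips a sensitive feature."""
    return any(cf & SENSITIVE for cf in counterfactuals)


def hides_bias(partial, complete):
    """True if the complete set involves a sensitive feature
    but the partial explanation does not mention one."""
    return touches_sensitive(complete) and not touches_sensitive(partial)


applicant = {"income_high": 1, "has_debt": 1, "gender": 0}
complete = minimal_counterfactuals(applicant)
# A partial explanation that silently drops sensitive counterfactuals.
partial = [cf for cf in complete if not cf & SENSITIVE]

print("complete:", complete)
print("partial:", partial)
print("partial hides bias:", hides_bias(partial, complete))
```

For this applicant the complete set contains two single-feature counterfactuals, one flipping `has_debt` and one flipping `gender`. Reporting only the first is a true but partial explanation, and comparing it against the complete set is what exposes the omitted dependence on `gender`.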