Counterfactual explanations and algorithmic recourses for machine learning: A review

Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan Hines, John Dickerson, Chirag Shah
2024-10-03
Abstract: Machine learning plays a role in many deployed decision systems, often in ways that are difficult or impossible to understand by human stakeholders. Explaining, in a human-understandable way, the relationship between the input and output of machine learning models is essential to the development of trustworthy machine learning based systems. A burgeoning body of research seeks to define the goals and methods of explainability in machine learning. In this article, we seek to review and categorize research on counterfactual explanations, a specific class of explanation that provides a link between what could have happened had input to a model been changed in a particular way. Modern approaches to counterfactual explainability in machine learning draw connections to the established legal doctrine in many countries, making them appealing to fielded systems in high-impact areas such as finance and …