Abstract: Machine learning models have achieved impressive performance across a range of applications. However, their often-perceived black-box nature and the lack of transparency in their decision-making have raised concerns about how well their predictions can be understood. To tackle this challenge, researchers have developed methods that provide explanations for machine learning models. In this paper, we introduce LaPLACE-Explainer, designed to provide probabilistic cause-and-effect explanations for any classifier operating on tabular data, in a human-understandable manner. LaPLACE-Explainer leverages the concept of a Markov blanket to automatically establish statistical boundaries between relevant and non-relevant features. This approach automatically generates optimal feature subsets that serve as explanations for predictions. Importantly, it eliminates the need to predetermine a fixed number N of top features as explanations, which makes our methodology more flexible and adaptable. By incorporating conditional probabilities, our approach offers probabilistic causal explanations and outperforms LIME and SHAP (well-known model-agnostic explainers) in terms of local accuracy and consistency of explained features. LaPLACE's soundness, consistency, local accuracy, and adaptability are rigorously validated across various classification models. Furthermore, we demonstrate the practical utility of these explanations via experiments with both simulated and real-world datasets. These experiments address trust-related issues, such as evaluating prediction reliability, facilitating model selection, enhancing trustworthiness, and identifying fairness-related concerns within classifiers.
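
To illustrate the Markov blanket idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a causal graph is already available as a directed acyclic graph and returns the parents, children, and co-parents of a target node; the remaining nodes fall outside the blanket. The graph `toy_dag`, its node names, and the helper `markov_blanket` are hypothetical and chosen only for illustration; how LaPLACE learns such a structure from data is not shown here.

```python
from typing import Dict, List, Set


def markov_blanket(parents: Dict[str, List[str]], target: str) -> Set[str]:
    """Return the Markov blanket of `target` in a DAG.

    `parents` maps every node to the list of its direct parents.
    The blanket is the union of the target's parents, its children,
    and the other parents of those children (co-parents).
    """
    blanket: Set[str] = set(parents.get(target, []))  # direct parents
    for node, node_parents in parents.items():
        if target in node_parents:
            blanket.add(node)             # child of the target
            blanket.update(node_parents)  # co-parents (also adds target itself)
    blanket.discard(target)               # the target is not part of its own blanket
    return blanket


# Hypothetical toy graph: each key lists its direct parents.
toy_dag = {
    "age": [],
    "income": ["age"],
    "debt": [],
    "prediction": ["income", "debt"],
    "region": [],
    "approval_letter": ["prediction", "region"],
    "zip_code": ["region"],
}

explanation_features = markov_blanket(toy_dag, "prediction")
print(sorted(explanation_features))
```

In this toy graph the blanket of `prediction` contains `income`, `debt`, `approval_letter`, and `region`, while `age` and `zip_code` are left out. The size of the subset follows from the graph structure rather than from a preset number N of top features, which is the property the abstract highlights.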

The Intriguing Properties of Model Explanations

Accurate and Intuitive Contextual Explanations using Linear Model Trees

Are Visual Explanations Useful? A Case Study in Model-in-the-Loop Prediction

On the Relationship Between Explanation and Prediction: A Causal View

Explaining Explanations in AI

Learning Model Agnostic Explanations via Constraint Programming

Selective Explanations

Towards a Unified Framework for Evaluating Explanations

Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations

Respect the model: Fine-grained and Robust Explanation with Sharing Ratio Decomposition

Human-in-the-Loop Model Explanation via Verbatim Boundary Identification in Generated Neighborhoods

Causality-Aware Local Interpretable Model-Agnostic Explanations

Exploring local explanations of nonlinear models using animated linear projections

Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance

Model Interpretation and Explainability: Towards Creating Transparency in Prediction Models

Considerations When Learning Additive Explanations for Black-Box Models

Minimalistic Explanations: Capturing the Essence of Decisions

Can I Trust the Explanations? Investigating Explainable Machine Learning Methods for Monotonic Models

LaPLACE: Probabilistic Local Model-Agnostic Causal Explanations

Explaining Decisions in ML Models: a Parameterized Complexity Analysis

Unified Explanations in Machine Learning Models: A Perturbation Approach