Abstract: Local Interpretable Model-agnostic Explanations (LIME) is a well-known post-hoc technique for explaining black-box models. Although it is widely used, recent research highlights problems with the explanations it generates. In particular, the explanations can lack stability: they vary over repeated runs of the algorithm, which casts doubt on their reliability. This paper investigates the stability of LIME when applied to multivariate time series classification. We demonstrate that the traditional methods for generating neighbours in LIME carry a high risk of creating 'fake' neighbours, which are out-of-distribution with respect to the trained model and far from the input to be explained. This risk is particularly pronounced for time series data because of their substantial temporal dependencies. We discuss how these out-of-distribution neighbours contribute to unstable explanations. Furthermore, LIME weights neighbours using user-defined hyperparameters that are problem-dependent and hard to tune. We show how unsuitable hyperparameters can impair the stability of explanations. We propose a two-fold approach to address these issues. First, a generative model is employed to approximate the distribution of the training data set, from which within-distribution samples, and thus meaningful neighbours, can be created for LIME. Second, an adaptive weighting method is designed whose hyperparameters are easier to tune than those of the traditional method. Experiments on real-world data sets demonstrate that the proposed method provides more stable explanations within the LIME framework. In addition, we discuss in depth the reasons behind these results.
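The instability described in the abstract can be illustrated with a minimal sketch of a LIME-style procedure. This is not the paper's method and not the `lime` library: the `black_box` function, the independent Gaussian perturbations, the exponential kernel form, and the `instability` measure (mean standard deviation of the surrogate coefficients across random seeds) are all assumptions chosen for illustration.

```python
import numpy as np

def black_box(X):
    # Hypothetical stand-in for a trained classifier: a smooth nonlinear score in (0, 1).
    return 1.0 / (1.0 + np.exp(-(np.sin(3 * X[:, 0]) + X[:, 1] ** 2)))

def lime_coefficients(x, kernel_width, n_samples=200, seed=0):
    """Fit a weighted linear surrogate around x and return its slopes."""
    rng = np.random.default_rng(seed)
    # Neighbours from independent Gaussian perturbation (assumed sampling scheme);
    # nothing constrains them to stay within the training distribution.
    Z = x + rng.normal(scale=1.0, size=(n_samples, x.size))
    y = black_box(Z)
    # Exponential kernel on Euclidean distance; kernel_width is the user-set hyperparameter.
    d = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(d ** 2) / kernel_width ** 2)
    # Weighted least squares with an intercept, solved by scaling rows with sqrt(w).
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[1:]

def instability(x, kernel_width, n_runs=20):
    """Mean per-feature standard deviation of the slopes over repeated runs."""
    coefs = np.array([lime_coefficients(x, kernel_width, seed=s) for s in range(n_runs)])
    return coefs.std(axis=0).mean()

if __name__ == "__main__":
    x = np.array([0.5, -0.2])
    for width in (0.25, 1.0, 4.0):
        print(f"kernel_width={width}: instability={instability(x, width):.4f}")
```

Running the script prints one instability value per kernel width, which gives a rough feel for how sensitive the surrogate coefficients are to both the random neighbours and the weighting hyperparameter in this toy setting.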

SEGAL time series classification - Stable explanations using a generative model and an adaptive weighting method for LIME