From Feature Importance to Natural Language Explanations Using LLMs with RAG

Sule Tekkesinoglu, Lars Kunze

2024-07-31

Abstract: As machine learning becomes increasingly integral to autonomous decision-making processes involving human interaction, the necessity of comprehending the model's outputs through conversational means increases. Most recently, foundation models are being explored for their potential as post hoc explainers, providing a pathway to elucidate the decision-making mechanisms of predictive models. In this work, we introduce traceable question-answering, leveraging an external knowledge repository to inform the responses of Large Language Models (LLMs) to user queries within a scene understanding task. This knowledge repository comprises contextual details regarding the model's output, containing high-level features, feature importance, and alternative probabilities. We employ subtractive counterfactual reasoning to compute feature importance, a method that entails analysing output variations resulting from decomposing semantic features. Furthermore, to maintain a seamless conversational flow, we integrate four key characteristics - social, causal, selective, and contrastive - drawn from social science research on human explanations into a single-shot prompt, guiding the response generation process. Our evaluation demonstrates that explanations generated by the LLMs encompassed these elements, indicating their potential to bridge the gap between complex model outputs and natural language expressions.
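To make the retrieval-grounded prompting idea concrete, here is a minimal Python sketch of how a knowledge-repository entry might be turned into a single prompt. It is an illustration under stated assumptions, not the authors' implementation: `PredictionRecord`, `STYLE_RULES`, `build_prompt`, the prompt wording, and the `top_k` parameter are all hypothetical names introduced here. Only the record contents (high-level features, feature importance, alternative probabilities) and the four characteristics come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class PredictionRecord:
    """Hypothetical knowledge-repository entry for one model output."""
    prediction: str
    features: list[str]             # high-level features detected in the scene
    importance: dict[str, float]    # feature -> importance score
    alternatives: dict[str, float]  # alternative output -> probability


# The four explanation characteristics named in the abstract.
# The one-line glosses are assumptions made for this sketch.
STYLE_RULES = (
    "social: answer conversationally, at the user's level",
    "causal: say which features led to the output",
    "selective: mention only the most important features",
    "contrastive: say why this output and not the closest alternative",
)


def build_prompt(record, question, top_k=2):
    """Assemble one prompt that grounds the LLM in the retrieved record."""
    # Selective: keep only the top_k features by importance.
    ranked = sorted(record.importance.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:top_k]
    # Contrastive: pick the most probable alternative output.
    closest_alt = max(record.alternatives.items(), key=lambda kv: kv[1])
    lines = [
        "Use only the context below; do not mention objects that are not listed.",
        f"Model output: {record.prediction}",
        "Detected features: " + ", ".join(record.features),
        "Most important features: " + ", ".join(f"{n} ({s:.2f})" for n, s in top),
        f"Closest alternative: {closest_alt[0]} (p={closest_alt[1]:.2f})",
        "Explanation style:",
        *[f"- {rule}" for rule in STYLE_RULES],
        f"User question: {question}",
    ]
    return "\n".join(lines)
```

The resulting string would be sent to whichever LLM is in use. Because every fact in the prompt comes from the stored record, each statement in the answer can in principle be traced back to an entry in the repository.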

Artificial Intelligence, Computation and Language, Computer Vision and Pattern Recognition, Human-Computer Interaction, Machine Learning

What problem does this paper attempt to address?

The paper aims to improve the transparency and interpretability of decision-making in machine-learning models, especially in autonomous decision-making systems that involve human interaction. Specifically, the authors propose a traceable question-answering method that combines an external knowledge base with large language models (LLMs) to generate natural-language explanations. The method aims to:

1. **Reduce object hallucination**: Ground the LLM in the model output information held in the external knowledge source, reducing the erroneous or untrue statements it might otherwise produce when generating explanations.
2. **Make explanations social, causal, selective and contrastive**: Integrate these four characteristics, drawn from social science research on human explanations, so that the generated explanations match how people communicate and are easier for users to understand and accept.
3. **Provide detailed feature-importance analysis**: Use subtractive counterfactual reasoning to compute feature importance, showing which input features have a significant impact on the model's decision.
4. **Achieve a seamless conversation experience**: Keep the interaction with the user smooth and natural by guiding the LLM to produce responses with the above characteristics through a single-shot prompt.

Through these methods, the paper aims to bridge the gap between complex model outputs and natural-language expressions and to promote more effective communication between humans and machines.
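The subtractive counterfactual idea in item 3 can be sketched in a few lines of Python: remove one feature at a time and record how much the probability of the predicted output drops. This is a minimal illustration under assumptions, not the paper's procedure. `subtractive_importance` and `toy_predict` are hypothetical names, and the toy model with its weights is invented purely so the example runs; the paper's own approach to decomposing semantic features may differ.

```python
def subtractive_importance(predict, features, target):
    """Score each feature by the drop in P(target) when it is removed."""
    baseline = predict(features)[target]
    scores = {}
    for f in features:
        # Counterfactual input: the same scene without feature f.
        reduced = [g for g in features if g != f]
        scores[f] = baseline - predict(reduced)[target]
    return scores


def toy_predict(features):
    """Hypothetical stand-in for the scene-understanding model."""
    # Invented weights, chosen only to make the example deterministic.
    weights = {"pedestrian": 0.5, "red light": 0.25, "parked car": 0.0}
    p_stop = 0.125 + sum(weights.get(f, 0.0) for f in features)
    return {"stop": p_stop, "go": 1.0 - p_stop}


scene = ["pedestrian", "red light", "parked car"]
scores = subtractive_importance(toy_predict, scene, target="stop")
# With these toy weights, removing "pedestrian" lowers P(stop) the most,
# and removing "parked car" does not change it.
```

A larger drop means the feature mattered more to the output. A score near zero suggests the feature could be left out of the explanation, which is what the selective characteristic calls for.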

Towards Explainability in Retrieval-Augmented LLMs

From Understanding to Utilization: A Survey on Explainability for Large Language Models

LLMs for XAI: Future Directions for Explaining Explanations

Explingo: Explaining AI Predictions using Large Language Models

In-Context Explainers: Harnessing LLMs for Explaining Black Box Models

XplainLLM: A Knowledge-Augmented Dataset for Reliable Grounded Explanations in LLMs

Evaluating Explanations Through LLMs: Beyond Traditional User Studies

From large language models to small logic programs: building global explanations from disagreeing local post-hoc explainers

Post Hoc Explanations of Language Models Can Improve Language Models

Towards Uncovering How Large Language Model Works: An Explainability Perspective

Using LLMs for Explaining Sets of Counterfactual Examples to Final Users

Large Language Models as Evaluators for Recommendation Explanations

From Data to Commonsense Reasoning: The Use of Large Language Models for Explainable AI

LLM-Generated Black-box Explanations Can Be Adversarially Helpful

Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals

Explaining Agent Behavior with Large Language Models

Explaining Natural Language Processing Classifiers with Occlusion and Language Modeling

Scenarios and Approaches for Situated Natural Language Explanations

Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving

Why Would You Suggest That? Human Trust in Language Model Responses