From Feature Importance to Natural Language Explanations Using LLMs with RAG

Sule Tekkesinoglu, Lars Kunze
2024-07-31
Abstract: As machine learning becomes increasingly integral to autonomous decision-making processes involving human interaction, the necessity of comprehending the model's outputs through conversational means increases. Most recently, foundation models are being explored for their potential as post hoc explainers, providing a pathway to elucidate the decision-making mechanisms of predictive models. In this work, we introduce traceable question-answering, leveraging an external knowledge repository to inform the responses of Large Language Models (LLMs) to user queries within a scene understanding task. This knowledge repository comprises contextual details regarding the model's output, containing high-level features, feature importance, and alternative probabilities. We employ subtractive counterfactual reasoning to compute feature importance, a method that entails analysing output variations resulting from decomposing semantic features. Furthermore, to maintain a seamless conversational flow, we integrate four key characteristics - social, causal, selective, and contrastive - drawn from social science research on human explanations into a single-shot prompt, guiding the response generation process. Our evaluation demonstrates that explanations generated by the LLMs encompassed these elements, indicating their potential to bridge the gap between complex model outputs and natural language expressions.
Artificial Intelligence, Computation and Language, Computer Vision and Pattern Recognition, Human-Computer Interaction, Machine Learning
What problem does this paper attempt to address?
This paper aims to improve the transparency and interpretability of decision-making in machine-learning models, especially in autonomous decision-making systems that involve human interaction. Specifically, the authors propose a traceable question-answering method that combines an external knowledge repository with large language models (LLMs) to generate natural-language explanations. The method aims to:

1. **Reduce object hallucination**: Ground the LLM's responses in the model output information supplied by the external knowledge repository, reducing the erroneous or untrue statements that LLMs may otherwise produce when generating explanations.
2. **Enhance the social, causal, selective and contrastive nature of explanations**: Integrate these four characteristics, drawn from social science research on human explanations, so that the generated explanations match the habits of human communication and are easier for users to understand and accept.
3. **Provide detailed feature-importance analysis**: Use subtractive counterfactual reasoning, which analyses how the model's output changes when semantic features are removed, to identify which input features have a significant impact on the model's decision.
4. **Achieve a seamless conversation experience**: Keep the interaction with the user smooth and natural by guiding the LLM to generate responses with the above characteristics through a single-shot prompt.

Through these methods, the paper aims to bridge the gap between complex model outputs and natural-language expressions and to promote more effective communication between humans and machines.
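
To make the subtractive counterfactual idea more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a model that can be queried on any subset of semantic features; `toy_model`, its weights, the feature names, and the prompt wording in `render_prompt` are all hypothetical and chosen only for illustration. Each feature's importance is taken as the drop in the predicted class probability when that feature alone is removed, and the result is packed into a record holding the three kinds of context the abstract lists (high-level features, feature importance, alternative probabilities).

```python
from typing import Callable, Dict, FrozenSet, List

Model = Callable[[FrozenSet[str]], Dict[str, float]]


def toy_model(features: FrozenSet[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a scene-understanding model.

    Maps the set of semantic features present in a scene to class
    probabilities. The weights are invented for this example.
    """
    weights = {"pedestrian": 0.6, "red_light": 0.25, "parked_car": 0.0}
    p_stop = min(0.1 + sum(w for f, w in weights.items() if f in features), 1.0)
    return {"stop": p_stop, "go": 1.0 - p_stop}


def subtractive_importance(
    model: Model, features: List[str], target: str
) -> Dict[str, float]:
    """Importance of each feature = drop in P(target) when it is removed."""
    full = frozenset(features)
    base = model(full)[target]
    return {f: base - model(full - {f})[target] for f in features}


def build_context(model: Model, features: List[str]) -> Dict[str, object]:
    """Assemble a knowledge record describing one model output."""
    probs = model(frozenset(features))
    predicted = max(probs, key=probs.get)
    return {
        "features": list(features),
        "prediction": predicted,
        "probabilities": probs,  # includes the alternative classes
        "feature_importance": subtractive_importance(model, features, predicted),
    }


def render_prompt(context: Dict[str, object], question: str) -> str:
    """Hypothetical prompt layout: retrieved context first, then the query."""
    lines = [
        "Answer using only the context below.",
        f"Prediction: {context['prediction']}",
        f"Probabilities: {context['probabilities']}",
        f"Feature importance: {context['feature_importance']}",
        f"Question: {question}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    ctx = build_context(toy_model, ["pedestrian", "red_light", "parked_car"])
    print(render_prompt(ctx, "Why did the vehicle stop?"))
```

In this toy setting the parked car receives zero importance because removing it leaves the output unchanged, which is the kind of signal a selective explanation could use to omit it. A real system would recompute the model's output on an altered scene rather than look up fixed weights.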