Tell Me a Story! Narrative-Driven XAI with Large Language Models

David Martens, James Hinns, Camille Dams, Mark Vergouwen, Theodoros Evgeniou
2024-06-13
Abstract: In many AI applications today, the predominance of black-box machine learning models, due to their typically higher accuracy, amplifies the need for Explainable AI (XAI). Existing XAI approaches, such as the widely used SHAP values or counterfactual (CF) explanations, are arguably often too technical for users to understand and act upon. To enhance comprehension of explanations of AI decisions and the overall user experience, we introduce XAIstories, which leverage Large Language Models to provide narratives about how AI predictions are made: SHAPstories do so based on SHAP explanations, while CFstories do so for CF explanations. We study the impact of our approach on users' experience and understanding of AI predictions. Our results are striking: over 90% of the surveyed general audience finds the narratives generated by SHAPstories convincing. Data scientists primarily see the value of SHAPstories in communicating explanations to a general audience, with 83% of data scientists indicating they are likely to use SHAPstories for this purpose. In an image classification setting, CFstories are considered more or equally convincing as the users' own crafted stories by more than 75% of the participants. CFstories additionally bring a tenfold speed gain in creating a narrative. We also find that SHAPstories help users to more accurately summarize and understand AI decisions, in a credit scoring setting we test, correctly answering comprehension questions significantly more often than they do when only SHAP values are provided. The results thereby suggest that XAIstories may significantly help explaining and understanding AI predictions, ultimately supporting better decision-making in various applications.
Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses a gap in Explainable AI (XAI). Black-box machine-learning models are widely used in AI applications because of their high accuracy, which increases the demand for explanations. Existing XAI methods, such as the widely used SHAP values or counterfactual (CF) explanations, are often too technical for ordinary users to understand and apply. The paper therefore proposes **XAIstories**, which use large language models (LLMs) to generate narratives about how AI predictions are made, in order to improve users' understanding of AI decisions and the overall user experience. Specifically:

- **SHAPstories** generate narratives based on SHAP explanations.
- **CFstories** generate narratives based on counterfactual explanations.

### Main contributions of the paper:

1. **Introduction of a new XAI method**: The paper proposes explaining AI predictions by generating natural-language narratives, which opens up a new research area.
2. **Open-source implementation**: An open-source implementation based on GPT-4 is provided for generating narratives for two commonly used XAI methods (counterfactual explanations and SHAP values).
3. **Improvement of user experience and understanding**: Four survey studies show that the method improves user satisfaction, ease of use, confidence, and willingness to use, and also improves users' ability to understand explanations.

### Specific research content:

- **Counterfactual explanations in image classification**: The SCAP (Semantic Counterfactuals for Accurate Picture) method is used to generate counterfactual explanations for image classification, and CFstories turn these explanations into narratives.
- **SHAP explanations in tabular data**: For tabular data, SHAP explanations are used to generate SHAPstories on three datasets: FIFA 2018 best player prediction, student performance prediction, and German credit scoring.

### Survey design:

- **User experience survey**: Four surveys were conducted among ordinary users and data scientists to evaluate the effect of XAIstories on user experience, decision support, and understanding.
- **User understanding improvement survey**: A quantitative evaluation tests whether SHAPstories improve users' understanding of AI decisions, specifically in a credit-scoring scenario.

### Results:

- **High user satisfaction**: More than 90% of ordinary users find the narratives generated by SHAPstories convincing.
- **Recognition by data scientists**: 83% of data scientists indicate that they are likely to use SHAPstories to communicate explanations to a general audience.
- **CFstories in image classification**: More than 75% of the participants consider the CFstories narratives as convincing as, or more convincing than, the stories they wrote themselves. CFstories also make creating a narrative about ten times faster.
- **Improvement of user understanding**: In the credit-scoring scenario, users given SHAPstories answer comprehension questions correctly significantly more often than users given only SHAP values.

In conclusion, by introducing XAIstories, the paper aims to improve users' understanding of and trust in AI decisions through natural-language narratives, thereby supporting better decision-making.
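
To make the SHAPstories idea concrete, the following is a minimal sketch of how SHAP attributions for one prediction might be turned into a prompt for an LLM. It is not the authors' open-source implementation: the function name `build_shap_story_prompt`, the prompt wording, the `top_k` parameter, and the credit-scoring feature names and values are all hypothetical, and the call to the LLM itself is left out.

```python
from typing import Mapping


def build_shap_story_prompt(
    prediction: str,
    shap_values: Mapping[str, float],
    feature_values: Mapping[str, object],
    top_k: int = 3,
) -> str:
    """Build an LLM prompt from SHAP attributions (hypothetical helper).

    Keeps the top_k features with the largest absolute SHAP value and
    describes each as raising or lowering the model's score.
    """
    # Rank features by the magnitude of their attribution, largest first.
    ranked = sorted(shap_values.items(), key=lambda kv: abs(kv[1]), reverse=True)
    lines = []
    for name, value in ranked[:top_k]:
        direction = "raised" if value > 0 else "lowered"
        lines.append(
            f"- {name} = {feature_values[name]} "
            f"({direction} the score by {abs(value):.2f})"
        )
    # Prompt wording is an assumption, not taken from the paper.
    return (
        f"The model predicted: {prediction}.\n"
        "The most influential features were:\n"
        + "\n".join(lines)
        + "\nWrite a short, plausible narrative for a general audience "
        "that explains this prediction using only the features listed."
    )


# Made-up credit-scoring example; these are not values from the paper.
example_prompt = build_shap_story_prompt(
    prediction="credit application rejected",
    shap_values={"duration_months": -0.42, "credit_amount": -0.17, "age": 0.05},
    feature_values={"duration_months": 48, "credit_amount": 9000, "age": 35},
    top_k=2,
)
print(example_prompt)
```

The resulting string would then be sent to an LLM such as GPT-4, which the paper reports using. Limiting the prompt to the highest-magnitude features is a design choice of this sketch, intended to keep the narrative short.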