XAI-FUNGI: Dataset resulting from the user study on comprehensibility of explainable AI algorithms

Szymon Bobek, Paloma Korycińska, Monika Krakowska, Maciej Mozolewski, Dorota Rak, Magdalena Zych, Magdalena Wójcik, Grzegorz J. Nalepa
2024-10-21
Abstract: This paper introduces a dataset that is the result of a user study on the comprehensibility of explainable artificial intelligence (XAI) algorithms. The study participants were recruited from 149 candidates to form three groups representing experts in the domain of mycology (DE), students with a data science and visualization background (IT), and students from social sciences and humanities (SSH). The main part of the dataset contains 39 transcripts of interviews during which participants were asked to complete a series of tasks and questions related to the interpretation of explanations of decisions of a machine learning model trained to distinguish between edible and inedible mushrooms. The transcripts are complemented with additional data that includes visualizations of the explanations presented to the user, results from thematic analysis, recommendations for improving the explanations provided by the participants, and the initial survey results that make it possible to determine each participant's domain knowledge and data analysis literacy. The transcripts were manually tagged to allow for automatic matching between the text and other data related to particular fragments. In an era of rapid development of XAI techniques, the need for multidisciplinary qualitative evaluation of explainability is one of the emerging topics in the community. Our dataset not only makes it possible to reproduce the study we conducted but also opens a wide range of possibilities for analyzing the material we gathered.
Computers and Society, Artificial Intelligence, Machine Learning
### What problem does this paper attempt to solve?

This paper addresses the comprehensibility of Explainable Artificial Intelligence (XAI) algorithms. Specifically, it presents a dataset, collected through a user study, for evaluating how well people from different backgrounds understand the explanations produced by XAI algorithms. The main objectives and background of the study are as follows:

1. **Background and Motivation**:
   - Black-box machine learning models (such as deep neural networks and gradient-boosted trees) are increasingly applied in high-risk fields such as medicine, law, and industry. Their decision-making processes are often opaque, which creates a need for explanations.
   - Governments and research institutions (for example through DARPA's XAI Challenge and the EU's GDPR and AI Act) are also pushing for greater transparency of AI systems to facilitate their use in key areas.
2. **Research Questions**:
   - **The Comprehensibility of XAI Algorithms**: Although XAI algorithms have progressed, ensuring that their explanations are understandable to humans remains a challenge. Different user groups (such as domain experts, data science students, and social science students) may differ significantly in how well they understand explanations.
   - **Lack of Multidisciplinary Evaluation Methods**: There is currently a lack of interdisciplinary methods for measuring and evaluating the comprehensibility of XAI algorithms, and a lack of datasets to support such evaluations.
3. **Research Objectives**:
   - **Construct a Dataset**: Through a user study, build a dataset that captures how users from different backgrounds understand XAI explanations.
   - **Evaluate the Comprehensibility of Explanations**: Evaluate the comprehensibility of different types of XAI explanations (such as SHAP, LIME, Anchor, and DiCE) by analyzing how users perform when interpreting the decisions of a mushroom classification model.
   - **Provide Improvement Suggestions**: Based on user feedback, propose improvements that make XAI explanations more comprehensible and practical.
4. **Research Methods**:
   - **User Grouping**: Recruit 39 participants from 149 candidates and divide them into three groups: domain experts in mycology (DE), students with a data science and visualization background (IT), and students from social sciences and humanities (SSH).
   - **Task Design**: Participants complete a series of tasks, including answering questions about explanations, ranking explanations, and suggesting improvements.
   - **Data Collection**: Conduct interviews using the think-aloud protocol and record the participants' thought processes and reactions. The interviews are transcribed and manually annotated so that they can be matched with other relevant data.
5. **Dataset Contents**:
   - The dataset contains 39 interview transcripts, visualizations of the explanations, results of thematic analysis, improvement suggestions put forward by participants, and initial survey results used to determine each participant's domain knowledge and data analysis literacy.

Through this study, the authors hope to provide a valuable resource for researchers in the XAI field to explore and improve the transparency and comprehensibility of AI systems, especially in critical applications involving human interaction and trust.
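
The manual annotation mentioned under Data Collection is what lets transcript fragments be matched automatically with other data, such as the visualization shown for a given task. The following is a minimal sketch of that idea, not the dataset's actual format: the `[[TASK:<id>]]` tag syntax, the task ids, the file names, and the sample transcript text are all hypothetical and would need to be replaced with the conventions documented in the dataset itself.

```python
import re
from collections import defaultdict

# Hypothetical inline tag marking where a participant starts a task.
# The real dataset's tag syntax may differ; adjust the pattern accordingly.
TAG_PATTERN = re.compile(r"\[\[TASK:(\w+)\]\]")


def split_by_tag(transcript):
    """Map each task id to the transcript text that follows its tag."""
    segments = defaultdict(list)
    parts = TAG_PATTERN.split(transcript)
    # With one capture group, re.split returns
    # [prefix, id1, text1, id2, text2, ...]; the untagged prefix is skipped.
    for task_id, text in zip(parts[1::2], parts[2::2]):
        segments[task_id].append(text.strip())
    return {task_id: " ".join(chunks) for task_id, chunks in segments.items()}


def attach_visualizations(segments, visualizations):
    """Pair each tagged fragment with the explanation shown for that task."""
    return [
        {
            "task": task_id,
            "text": text,
            # None when no visualization is registered for this task id.
            "visualization": visualizations.get(task_id),
        }
        for task_id, text in segments.items()
    ]


# Made-up transcript and file names, for illustration only.
example = (
    "[[TASK:T1]] I think the red bars push toward edible. "
    "[[TASK:T2]] This rule is easier to read."
)
visuals = {"T1": "shap_plot.png", "T2": "anchor_rule.png"}

matched = attach_visualizations(split_by_tag(example), visuals)
for row in matched:
    print(row["task"], row["visualization"], "-", row["text"])
```

Keying every resource on a shared task id keeps the matching step a plain dictionary lookup, so the same approach would extend to thematic-analysis codes or survey answers if they carry the same identifiers.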