To Trust or Not to Trust: Towards a novel approach to measure trust for XAI systems

Miquel Miró-Nicolau, Gabriel Moyà-Alcover, Antoni Jaume-i-Capó, Manuel González-Hidalgo, Maria Gemma Sempere Campello, Juan Antonio Palmer Sancho
2024-05-09
Abstract: The increasing reliance on Deep Learning models, combined with their inherent lack of transparency, has spurred the development of a novel field of study known as eXplainable AI (XAI). XAI methods seek to enhance the trust of end-users in automated systems by providing insights into the rationale behind their decisions. This paper presents a novel approach for measuring user trust in XAI systems, which in turn allows those systems to be refined. Our proposed metric combines both performance metrics and trust indicators from an objective perspective. To validate this novel methodology, we conducted a case study in a realistic medical scenario: the use of an XAI system for the detection of pneumonia from X-ray images.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses how to measure users' trust in Explainable Artificial Intelligence (XAI) systems, a question that has grown in importance as AI, and deep learning models in particular, are widely deployed. Because deep learning models are opaque, XAI methods have emerged to enhance users' trust in automated systems by providing insight into how decisions are made. The paper proposes a new metric for user trust in XAI systems that combines performance metrics with objective trust indicators.

The authors emphasize the importance of measuring trust and argue that objective evaluation is needed even though trust is subjective. They cite other studies that discuss different kinds of metrics, such as fidelity, robustness, and user trust, and propose a new measurement criterion based on classification quantities such as True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).

To validate the method, the authors first design three hypothetical scenarios and analyze how the metric behaves in each. They then apply it to a real medical dataset, using an AI model (ResNet18) and an explanation method (GradCAM) to detect COVID-19 pneumonia in chest X-ray images. The results indicate that the method can reveal users' trust in the system, and in particular how trust differs between correct and incorrect predictions. Overall, the paper aims to develop a new method for measuring users' trust in XAI systems and to show, through empirical research, that it is applicable and effective in sensitive fields such as medical diagnosis.
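The summary does not spell out how TP, TN, FP, and FN are defined for trust, so the following Python sketch is only one plausible reading, not the authors' formulation. It assumes that each case records two booleans: whether the model's prediction was correct, and whether the user trusted it after seeing the explanation. Under that assumption, trusting a correct prediction counts as TP, distrusting an incorrect one as TN, trusting an incorrect one as FP, and distrusting a correct one as FN. The names `TrustCounts` and `count_trust_outcomes` are hypothetical, and precision, recall, and F1 are the standard classification formulas applied to those counts.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TrustCounts:
    """Confusion-matrix counts over (prediction correctness, user trust).

    ASSUMPTION: this mapping is illustrative; the source does not define it.
    """
    tp: int = 0  # prediction correct,   user trusted it
    tn: int = 0  # prediction incorrect, user distrusted it
    fp: int = 0  # prediction incorrect, user trusted it
    fn: int = 0  # prediction correct,   user distrusted it

    def precision(self) -> float:
        # Share of trusted predictions that were actually correct.
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    def recall(self) -> float:
        # Share of correct predictions that the user trusted.
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


def count_trust_outcomes(
    prediction_correct: Sequence[bool],
    user_trusted: Sequence[bool],
) -> TrustCounts:
    """Tally the four outcomes from two parallel lists of booleans."""
    if len(prediction_correct) != len(user_trusted):
        raise ValueError("both sequences must have the same length")
    counts = TrustCounts()
    for correct, trusted in zip(prediction_correct, user_trusted):
        if correct and trusted:
            counts.tp += 1
        elif correct and not trusted:
            counts.fn += 1
        elif not correct and trusted:
            counts.fp += 1
        else:
            counts.tn += 1
    return counts


# Made-up toy data for illustration only (not from the paper).
model_was_correct = [True, True, True, False, False, True]
user_trusted_it = [True, True, False, True, False, True]

counts = count_trust_outcomes(model_was_correct, user_trusted_it)
print(counts)
print(counts.precision(), counts.recall(), counts.f1())
```

Separating precision from recall in this way would distinguish a user who trusts the system too readily (many FP, low precision) from one who is overly sceptical (many FN, low recall), which is in the spirit of the summary's point about trust differing between correct and incorrect predictions.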