Dermatologist-like explainable AI enhances melanoma diagnosis accuracy: eye-tracking study

Tirtha Chanda, Sarah Haggenmueller, Tabea-Clara Bucher, Tim Holland-Letz, Harald Kittler, Philipp Tschandl, Markus V. Heppt, Carola Berking, Jochen S. Utikal, Bastian Schilling, Claudia Buerger, Cristian Navarrete-Dechent, Matthias Goebeler, Jakob Nikolas Kather, Carolin V. Schneider, Benjamin Durani, Hendrike Durani, Martin Jansen, Juliane Wacker, Joerg Wacker, Reader Study Consortium, Titus J. Brinker
2024-09-20
Abstract: Artificial intelligence (AI) systems have substantially improved dermatologists' diagnostic accuracy for melanoma, with explainable AI (XAI) systems further enhancing clinicians' confidence and trust in AI-driven decisions. Despite these advancements, there remains a critical need for objective evaluation of how dermatologists engage with both AI and XAI tools. In this study, 76 dermatologists participated in a reader study, diagnosing 16 dermoscopic images of melanomas and nevi using an XAI system that provides detailed, domain-specific explanations. Eye-tracking technology was employed to assess their interactions. Diagnostic performance was compared with that of a standard AI system lacking explanatory features. Our findings reveal that XAI systems improved balanced diagnostic accuracy by 2.8 percentage points relative to standard AI. Moreover, diagnostic disagreements with AI/XAI systems and complex lesions were associated with elevated cognitive load, as evidenced by increased ocular fixations. These insights have significant implications for clinical practice, the design of AI tools for visual tasks, and the broader development of XAI in medical diagnostics.
Artificial Intelligence, Computer Vision and Pattern Recognition, Human-Computer Interaction
What problem does this paper attempt to address?
The paper evaluates how explainable artificial intelligence (XAI) systems affect dermatologists' accuracy and visual attention patterns when diagnosing melanoma. Specifically, it uses eye-tracking technology to objectively compare dermatologists' diagnostic performance and allocation of visual attention when they use a standard AI system versus an XAI system. The paper focuses on the following aspects:

1. **Diagnostic accuracy**: Compare dermatologists' diagnostic accuracy with standard AI systems and with XAI systems, in particular whether the XAI system significantly improves their diagnostic performance.
2. **Visual attention patterns**: Use eye-tracking to analyze how dermatologists allocate visual attention when using AI and XAI systems, in particular how attention changes when their diagnoses disagree with the system's predictions.
3. **Cognitive load**: Examine the cognitive load dermatologists experience with complex cases or cases where they disagree with the system's predictions, measured by the number of fixations and the fixation time in the eye-tracking data.
4. **Impact of experience level**: Analyze how performance differs among dermatologists with different levels of experience when using AI and XAI systems, in particular whether experienced dermatologists use these systems more effectively.

Through these analyses, the paper aims to inform the design and application of AI tools in clinical practice, especially how XAI systems can strengthen clinicians' trust and acceptance and thereby improve the accuracy and reliability of diagnosis.
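
The diagnostic accuracy comparison is reported as balanced accuracy, which is conventionally the mean of sensitivity and specificity, so that melanomas and nevi count equally regardless of how many of each a reader sees. The following is a minimal sketch of that calculation. The function name, the labels, and the toy readings are assumptions made for illustration; they are not the study's code or data.

```python
def balanced_accuracy(y_true, y_pred, positive="melanoma"):
    """Mean of sensitivity and specificity for a two-class diagnosis task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    sensitivity = tp / (tp + fn)  # share of melanomas correctly identified
    specificity = tn / (tn + fp)  # share of nevi correctly identified
    return (sensitivity + specificity) / 2


# Hypothetical readings from one reader (illustrative only).
truth = ["melanoma"] * 4 + ["nevus"] * 4
reads = ["melanoma", "melanoma", "melanoma", "nevus",
         "nevus", "nevus", "nevus", "melanoma"]
score = balanced_accuracy(truth, reads)

# On an imbalanced set, calling every lesion a nevus scores only 0.5,
# even though plain accuracy would be 6/8.
all_nevus = balanced_accuracy(["melanoma"] * 2 + ["nevus"] * 6, ["nevus"] * 8)
```

A difference of 2.8 percentage points, as reported in the abstract, would correspond to a gap of 0.028 between two such scores.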