Abstract: Deep Learning has shown outstanding results in computer vision tasks; healthcare is no exception. However, there is no straightforward way to expose the decision-making process of DL models. Good accuracy is not enough for skin cancer predictions. Understanding the model's behavior is crucial for clinical application and reliable outcomes. In this work, we identify desiderata for explanations in skin-lesion models. We analyzed seven methods, four based on pixel-attribution (Grad-CAM, Score-CAM, LIME, SHAP) and three on high-level concepts (ACE, ICE, CME), for a deep neural network trained on the International Skin Imaging Collaboration Archive. Our findings indicate that while these techniques reveal biases, there is room for improving the comprehensiveness of explanations to achieve transparency in skin-lesion models.

What problem does this paper attempt to address?

This paper addresses the interpretability of deep-learning models for skin lesion classification. Although deep-learning (DL) models perform excellently in computer vision tasks, high accuracy alone is not enough in the medical field, especially for skin cancer prediction. Understanding the model's decision-making process is crucial for clinical application and for reliable diagnostic results. The paper's main objective is therefore to evaluate and compare different interpretability methods in order to improve the transparency and credibility of skin lesion classification models.

### Core Problems of the Paper

1. **Insufficient Model Transparency**: Deep-learning models are usually regarded as "black boxes" whose decision-making processes are difficult to understand. In the medical field in particular, this opacity may lead to misdiagnosis or trust issues.
2. **Low-Quality Explanations**: Although existing explanation methods can reveal some biases, their explanations are not comprehensive enough for doctors and patients to fully trust the model's predictions.
3. **Lack of Systematic Evaluation**: There is currently no unified standard for evaluating the effectiveness and reliability of different explanation methods, which makes it difficult to choose the most suitable method in practice.

### Research Contents

To address the above problems, the paper conducts the following research:

- **Defining Explanation Attributes**: Proposes the attributes that explanations of skin lesion classification models should possess, including fidelity, meaningfulness, and effectiveness.
- **Method Comparison**: Analyzes seven explanation methods, four based on pixel attribution (Grad-CAM, Score-CAM, LIME, SHAP) and three based on high-level concepts (ACE, ICE, CME). These methods are applied to a deep neural network trained on the International Skin Imaging Collaboration (ISIC) Archive.
- **Result Evaluation**: Evaluates the performance of these methods qualitatively and quantitatively and discusses their respective advantages and disadvantages.

### Main Findings

- **Limitations of Existing Methods**: Although these explanation methods can reveal some biases, most of them still fall short, especially in providing comprehensive and transparent explanations.
- **Combined Use of Multiple Methods**: No single method meets all explanation requirements. Combining multiple explanation methods can compensate for their respective shortcomings and provide a more comprehensive understanding.
- **Future Directions**: The paper recommends further study of how doctors perceive and interpret these explanation methods, and extending the research to other architectures and datasets.

### Conclusion

The paper emphasizes that although existing explanation methods help, to a certain extent, in understanding the decision-making process of deep-learning models, further improvement and innovation are still required to achieve truly transparent and trustworthy skin lesion classification models. Combining multiple explanation methods and conducting systematic evaluation are important directions for future research.

Are Explanations Helpful? A Comparative Analysis of Explainability Methods in Skin Lesion Classifiers

Quantifying Explainable AI Methods in Medical Diagnosis: A study in skin cancer

Explainable Deep Image Classifiers for Skin Lesion Diagnosis

Improving trust and confidence in medical skin lesion diagnosis through explainable deep learning

Advancing Dermatological Diagnostics: Interpretable AI for Enhanced Skin Lesion Classification

Interpretable machine learning for dermatological disease detection: Bridging the gap between accuracy and explainability

Achievements and Challenges in Explaining Deep Learning based Computer-Aided Diagnosis Systems

Coherent Concept-based Explanations in Medical Image and Its Application to Skin Lesion Diagnosis

Pixel-Level Explanation of Multiple Instance Learning Models in Biomedical Single Cell Images

Exemplars and Counterexemplars Explanations for Image Classifiers, Targeting Skin Lesion Labeling

Explainability in CNN based Deep Learning models for medical image classification

Evaluating Machine Learning-based Skin Cancer Diagnosis

Aligning Characteristic Descriptors with Images for Human-Expert-like Explainability

Explainable Artificial Intelligence for Human Decision-Support System in Medical Domain

Aligning Human Knowledge with Visual Concepts Towards Explainable Medical Image Classification

Quantitative and Qualitative Evaluation of Explainable Deep Learning Methods for Ophthalmic Diagnosis

Minimalistic Explanations: Capturing the Essence of Decisions

Improving Interpretability of Deep Neural Networks in Medical Diagnosis by Investigating the Individual Units

Evaluating the Explainability of Attributes and Prototypes for a Medical Classification Model

LCE: A Framework for Explainability of DNNs for Ultrasound Image Based on Concept Discovery

MICA: Towards Explainable Skin Lesion Diagnosis via Multi-Level Image-Concept Alignment