Are Explanations Helpful? A Comparative Analysis of Explainability Methods in Skin Lesion Classifiers

Rosa Y. G. Paccotacya-Yanque, Alceu Bissoto, Sandra Avila
2024-12-04
Abstract: Deep Learning has shown outstanding results in computer vision tasks; healthcare is no exception. However, there is no straightforward way to expose the decision-making process of DL models. Good accuracy is not enough for skin cancer predictions. Understanding the model's behavior is crucial for clinical application and reliable outcomes. In this work, we identify desiderata for explanations in skin-lesion models. We analyzed seven methods, four based on pixel-attribution (Grad-CAM, Score-CAM, LIME, SHAP) and three on high-level concepts (ACE, ICE, CME), for a deep neural network trained on the International Skin Imaging Collaboration Archive. Our findings indicate that while these techniques reveal biases, there is room for improving the comprehensiveness of explanations to achieve transparency in skin-lesion models.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The key problem this paper addresses is the interpretability of deep-learning models in skin lesion classification. Although deep-learning (DL) models perform excellently in computer vision tasks, high accuracy alone is not enough in the medical field, especially in skin cancer prediction. Understanding the model's decision-making process is crucial for clinical application and for reliable diagnostic results. The main objective of the paper is therefore to evaluate and compare different interpretability methods in order to improve the transparency and credibility of skin lesion classification models.

### Core Problems of the Paper

1. **Insufficient Model Transparency**: Deep-learning models are usually regarded as "black boxes", and their decision-making processes are difficult to understand. In the medical field in particular, this opacity may lead to misdiagnosis or trust issues.
2. **Low-Quality Explanations**: Although existing explanation methods can reveal some biases, their explanations are not comprehensive enough, which makes it difficult for doctors and patients to fully trust the model's predictions.
3. **Lack of Systematic Evaluation**: There is currently no unified standard for evaluating the effectiveness and reliability of different explanation methods, which makes it difficult to choose the most suitable method in practice.

### Research Contents

To address the above problems, this paper conducts the following research:

- **Defining Explanation Attributes**: Proposes the attributes that explanations of skin lesion classification models should possess, including fidelity, meaningfulness, and effectiveness.
- **Method Comparison**: Analyzes seven explanation methods, four based on pixel attribution (Grad-CAM, Score-CAM, LIME, SHAP) and three based on high-level concepts (ACE, ICE, CME). These methods are applied to a deep neural network trained on the International Skin Imaging Collaboration (ISIC) Archive.
- **Result Evaluation**: Evaluates the performance of these methods qualitatively and quantitatively and discusses their respective advantages and disadvantages.

### Main Findings

- **Limitations of Existing Methods**: Although these explanation methods can reveal some biases, most of them still fall short, especially in providing comprehensive and transparent explanations.
- **Combined Use of Multiple Methods**: No single method meets all explanation requirements. Combining multiple explanation methods can compensate for their respective shortcomings and provide a more comprehensive understanding.
- **Future Directions**: The paper recommends further study of how doctors perceive and interpret these explanation methods, and extending the research to other architectures and datasets.

### Conclusion

This paper emphasizes that although existing explanation methods help to some extent in understanding the decision-making process of deep-learning models, further improvement and innovation are still required to achieve truly transparent and trustworthy skin lesion classification models. Combining multiple explanation methods and conducting systematic evaluation are important directions for future research.
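
To make the pixel-attribution family mentioned above more concrete, the following is a minimal sketch of the standard Grad-CAM computation in plain Python. It is not the paper's code. The function name `grad_cam` and the toy inputs are hypothetical, and the feature maps and gradients are assumed to have been extracted beforehand from a convolutional layer of the classifier. The final rescaling to [0, 1] is a common display convention, not part of the core recipe.

```python
from typing import List

Map = List[List[float]]  # one H x W feature map


def grad_cam(activations: List[Map], gradients: List[Map]) -> Map:
    """Combine K feature maps into one heatmap following the Grad-CAM recipe.

    activations[k][i][j]: value of feature map k at position (i, j)
    gradients[k][i][j]:   d(class score) / d(activations[k][i][j])
    """
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weight = spatial average of that channel's gradients.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum over channels, then ReLU to keep positive evidence only.
    cam = [
        [
            max(0.0, sum(w * act[i][j] for w, act in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Rescale to [0, 1] for display; an all-zero map is returned unchanged.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy input (hypothetical values): two 2x2 feature maps and their gradients.
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # channel weight 0.5
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel weight -1.0
]
heatmap = grad_cam(activations, gradients)
```

In this toy case the second channel has a negative weight, so the positions where it is active are clipped to zero by the ReLU, and the heatmap highlights only the regions supported by the first channel. In a real setting the heatmap would be upsampled to the input resolution and overlaid on the lesion image.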