Accurate Skin Lesion Classification Using Multimodal Learning on the HAM10000 Dataset

Abdulmateen Adebiyi, Nader Abdalnabi, Emily Smith Hoffman, Jesse Hirner, Eduardo Simoes, Mirna Becevic, Praveen Rao
DOI: https://doi.org/10.1101/2024.05.30.24308213
2024-08-28
Abstract: Objectives: Our aim is to evaluate whether multimodal deep learning, which classifies skin lesions using both images and textual descriptions, outperforms learning on images alone. Materials and Methods: We used the HAM10000 dataset, which contains 10000 skin lesion images. We combined the images with patient data (sex, age, and lesion location) to train and evaluate a multimodal deep learning model. The dataset was split into 70% for training, 20% for validation, and 10% for testing. We compared the multimodal model's performance to that of well-known deep learning models that use only images for classification. Results: We used accuracy and area under the receiver operating characteristic curve (AUCROC) as the metrics to compare the models' performance. Our multimodal model achieved the best accuracy (94.11%) and AUCROC (0.9426). Conclusion: Our study showed that a multimodal deep learning model can outperform traditional image-only deep learning models for skin lesion classification on the HAM10000 dataset. We believe our approach can enable primary care clinicians to screen patients for skin cancer with higher accuracy and reliability.
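The 70/20/10 split described in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the split was stratified by lesion class or which random seed was used, so the plain shuffled split, the function name `split_indices`, and the `seed` parameter below are assumptions made for illustration only, not the authors' procedure.

```python
import random

def split_indices(n_samples, train_pct=70, val_pct=20, seed=0):
    """Shuffle sample indices and split them into train/validation/test.

    Hypothetical helper: the percentages follow the abstract (70/20/10),
    but the unstratified shuffle and the seed are assumptions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Integer arithmetic avoids floating-point rounding in the split sizes.
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains (10% here)
    return train, val, test

train_idx, val_idx, test_idx = split_indices(10000)
print(len(train_idx), len(val_idx), len(test_idx))  # 7000 2000 1000
```

Because the lesion classes in a dermatology dataset are typically imbalanced, a stratified split may be preferable in practice; this sketch does not attempt that.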