Skin Cancer Classification Based on Convolutional Neural Networks and Vision Transformers

Zhenhao Zhao
DOI: https://doi.org/10.1088/1742-6596/2405/1/012037
2022-12-01
Journal of Physics: Conference Series
Abstract: Skin cancer is one of the most prevalent diseases in the world, and early diagnosis is the most effective method for preventing the disease and reducing mortality. Today, skin cancer is most commonly detected through visual diagnosis by specialists using dermoscopy images. However, the similar appearance of different skin cancer lesions and the complexity of dermoscopic images make detection and classification very challenging. To address these problems, this paper compares two current state-of-the-art deep learning approaches, Convolutional Neural Networks (CNNs) and transformers, to determine which is more suitable for automatic skin cancer lesion classification. First, we assign different weights to individual lesions to address the imbalance of the dataset. Second, we crop the images in the dataset and apply data augmentation to increase the sample size. Third, we select and construct the corresponding CNN and transformer models: VGGNet and ResNet for the CNNs, and Vision Transformer (ViT) and DeepViT for the transformers. Finally, we analyze these methods in terms of loss, accuracy, and confusion matrix on the HAM10000 dataset. The experimental results demonstrate that both CNN and transformer methods can achieve good performance on the skin cancer lesion classification task, but the CNN methods perform better than the transformer methods.
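The abstract does not state how the weights are derived. As a rough illustration only, the sketch below uses the common inverse-frequency heuristic, in which each class receives the weight N / (K * n_c) for N samples, K classes, and n_c samples in class c. The function name, the formula, and the toy label counts are all assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight} using weight = N / (K * n_c).

    Assumption: this heuristic is illustrative; the paper's actual
    weighting scheme is not given in the abstract.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


# Hypothetical toy labels (not the real HAM10000 class distribution).
labels = ["nv"] * 6 + ["mel"] * 3 + ["bcc"] * 1
weights = inverse_frequency_weights(labels)

# Rarer classes receive larger weights.
for name, weight in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{name}: {weight:.3f}")
```

Weights of this kind are typically passed to a weighted loss function so that errors on rare lesion types contribute more to the training signal than errors on the majority class.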
What problem does this paper attempt to address?
This paper addresses the challenge of skin cancer lesion classification. Skin cancer is one of the most common cancers globally, and early diagnosis is crucial for preventing the disease and reducing mortality. However, skin cancer lesions resemble normal skin and other non-malignant lesions, and dermoscopic images are complex, so detection and classification remain difficult. To address these challenges, the paper compares two current state-of-the-art deep learning methods, Convolutional Neural Networks (CNNs) and Vision Transformers, to determine which is more suitable for automatic skin cancer lesion classification. It handles dataset imbalance by assigning different weights to each lesion, increases the sample size through data augmentation, and constructs corresponding CNN and transformer models for the experiments. Finally, it analyzes the loss, accuracy, and confusion matrix of each model on the HAM10000 dataset to evaluate performance. The experimental results show that both approaches achieve good performance on the skin cancer lesion classification task, but the CNN methods outperform the transformer methods.
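To make the evaluation criteria concrete, the following minimal sketch builds a confusion matrix and derives overall accuracy and per-class recall from it. The helper names and the toy predictions are hypothetical and chosen for illustration; they are not results or code from the paper.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_recall(matrix):
    """Diagonal entry divided by row sum, one value per true class."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


# Hypothetical toy data (two classes only, not real model output).
classes = ["nv", "mel"]
y_true = ["nv", "nv", "nv", "mel", "mel"]
y_pred = ["nv", "nv", "mel", "mel", "nv"]

cm = confusion_matrix(y_true, y_pred, classes)
print(cm)
print(accuracy(cm))
print(per_class_recall(cm))
```

On an imbalanced dataset, per-class recall read from the confusion matrix is often more informative than overall accuracy, because a model can score high accuracy while missing most samples of a rare class.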