Abstract: Background: Artificial intelligence applied to dermoscopic images has brought major breakthroughs in recent years to the early diagnosis and treatment of skin cancer, whose incidence is increasing year by year worldwide and which poses a great threat to human health. Research on skin cancer image classification has advanced by using convolutional neural networks (CNNs) as deep backbones. This approach, however, extracts only the features of small objects in the image and cannot locate the important regions. Objectives: The authors therefore turn to vision transformers (ViT), which have demonstrated strong performance in traditional classification tasks. Self-attention increases the weight of important features and suppresses features that introduce noise. Specifically, an improved transformer network named SkinTrans is proposed. Innovations: To verify its efficiency, a three-step procedure is followed. First, a ViT network is established to verify the effectiveness of SkinTrans in skin cancer classification. Second, multi-scale, overlapping sliding windows are used to serialize the image, and multi-scale patch embedding is carried out so that the model pays more attention to multi-scale features. Finally, contrastive learning is used so that similar skin cancer data are encoded similarly while the encodings of different data are as different as possible. Main results: Experiments are carried out on two datasets: (1) HAM10000, a large dataset of multi-source dermatoscopic images of common skin cancers; and (2) a clinical dataset of skin cancer collected by dermoscopy. The proposed model achieves 94.3% accuracy on HAM10000 and 94.1% accuracy on our clinical dataset, which verifies the efficiency of SkinTrans.
Conclusions: The transformer network has achieved good results not only in natural language processing but also in vision, which lays a good foundation for skin cancer classification based on multimodal data. The authors believe this paper will be of interest to dermatologists, clinical researchers, computer scientists, and researchers in other related fields, and that the approach can provide greater convenience for patients.
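
The serialization step named in the abstract (multi-scale, overlapping sliding windows) can be illustrated with a minimal NumPy sketch. This is not the SkinTrans implementation: the function names, the 224x224 input, and the patch/stride pairs (16/8 and 32/16) are assumptions chosen only for illustration, and the learned linear projection that would turn each flattened patch into an embedding is omitted.

```python
import numpy as np


def overlapping_patches(image, patch, stride):
    """Serialize an (H, W, C) image into flattened sliding-window patches.

    A stride smaller than the patch size makes neighbouring windows overlap;
    stride == patch reproduces the non-overlapping split of a plain ViT.
    """
    height, width, _ = image.shape
    rows = range(0, height - patch + 1, stride)
    cols = range(0, width - patch + 1, stride)
    tokens = [
        image[r:r + patch, c:c + patch].reshape(-1)
        for r in rows
        for c in cols
    ]
    return np.stack(tokens)  # shape: (num_patches, patch * patch * C)


def multi_scale_tokens(image, scales=((16, 8), (32, 16))):
    """Return one token sequence per (patch, stride) pair (assumed values)."""
    return {
        patch: overlapping_patches(image, patch, stride)
        for patch, stride in scales
    }


# Hypothetical input: a random array standing in for a dermoscopic image.
image = np.random.default_rng(0).random((224, 224, 3))
tokens = multi_scale_tokens(image)
for patch, seq in tokens.items():
    print(patch, seq.shape)
```

Under these assumed settings the smaller scale produces many short tokens and the larger scale fewer, longer ones; a model would project each sequence to a common embedding width before attention.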

Classification of Mobile-Based Oral Cancer Images Using the Vision Transformer and the Swin Transformer

Intelligent deep learning supports biomedical image detection and classification of oral cancer

Efficient Lung Cancer Image Classification and Segmentation Algorithm Based on an Improved Swin Transformer

Detection of oral squamous cell carcinoma in clinical photographs using a vision transformer

Vision transformer-convolution for breast cancer classification using mammography images: A comparative study

Skin Cancer Segmentation and Classification Using Vision Transformer for Automatic Analysis in Dermatoscopy-Based Noninvasive Digital System

Early Cervical Cancer Diagnosis with SWIN-Transformer and Convolutional Neural Networks

An improved transformer network for skin cancer classification

Skin Cancer Classification Based on Convolutional Neural Networks and Vision Transformers

Leveraging Pretrained Vision Transformers for Automated Cancer Diagnosis in Optical Coherence Tomography Images

Skin Cancer Detection utilizing Deep Learning: Classification of Skin Lesion Images using a Vision Transformer

Automated Lung and Colon Cancer Classification Using Histopathological Images

Vision Transformer for Efficient Chest X-ray and Gastrointestinal Image Classification

Advancing brain tumor detection: harnessing the Swin Transformer’s power for accurate classification and performance analysis

A Novel Vision Transformer Model for Skin Cancer Classification

EP093/#664 Automatic multimodal classification using transformer for cervical cancer

SwinCNN: An Integrated Swin Transformer and CNN for Improved Breast Cancer Grade Classification

Pathological Insights: Enhanced Vision Transformers for the Early Detection of Colorectal Cancer

Application of Vision-Series Transformer in Screening for Coronary Heart Diseases Using Coronary CT Angiography.