Abstract: Accurate segmentation is crucial in diagnosing and analyzing skin lesions. However, automatic segmentation of skin lesions is extremely challenging because of their variable sizes, uneven color distributions, irregular shapes, hair occlusions, and blurred boundaries. Because the receptive fields of convolutional networks are limited, shallow convolutional layers cannot extract the global features of images, which limits segmentation performance. Because medical image datasets are small in scale, excessively deep networks can overfit and increase computational complexity. Although transformer networks are effective at extracting global information, they cannot extract sufficient local information to accurately segment detailed lesion features. In this study, we designed a dual-branch encoder that combines a convolutional neural network (CNN) and a transformer. The CNN branch of the encoder comprises four layers, which learn the local features of images through layer-wise downsampling. The transformer branch also comprises four layers, which learn global image information through attention mechanisms. The feature fusion module in the network integrates local features and global information, emphasizes important channel features through a channel attention mechanism, and filters out irrelevant features. Finally, skip connections pass information from the encoder to the decoder to compensate for the information lost during the sampling process, thereby enhancing segmentation accuracy. The data used in this paper are from four public datasets, including images of melanoma, basal cell tumor, fibroma, and benign nevus. Because of the limited size of the image data, we augmented them using random horizontal flipping, random vertical flipping, random brightness enhancement, random contrast enhancement, and rotation.
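The augmentation step listed above can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the function name `augment_pair`, the 50% flip probabilities, the restriction of rotation to multiples of 90 degrees, and the brightness and contrast ranges are all assumptions made for illustration, since the abstract does not give these settings. The one point it is meant to show is that geometric transforms must be applied identically to the image and its mask, whereas photometric changes apply to the image only.

```python
import numpy as np

def augment_pair(image, mask, rng):
    """Randomly augment an image and its segmentation mask together.

    image: float array in [0, 1] with shape (H, W, 3)
    mask:  array with shape (H, W)
    rng:   numpy.random.Generator
    """
    # Random horizontal flip, applied to both image and mask.
    if rng.random() < 0.5:
        image, mask = image[:, ::-1], mask[:, ::-1]
    # Random vertical flip, applied to both image and mask.
    if rng.random() < 0.5:
        image, mask = image[::-1], mask[::-1]
    # Rotation by a random multiple of 90 degrees (assumed; the
    # abstract does not state which rotation angles were used).
    k = int(rng.integers(0, 4))
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    # Photometric changes affect the image only, never the mask.
    brightness = rng.uniform(1.0, 1.2)  # hypothetical range
    contrast = rng.uniform(1.0, 1.2)    # hypothetical range
    image = image * brightness
    mean = image.mean()
    image = np.clip((image - mean) * contrast + mean, 0.0, 1.0)
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
mask = (rng.random((8, 8)) > 0.5).astype(np.uint8)
aug_image, aug_mask = augment_pair(image, mask, rng)
```

Flips and 90-degree rotations only permute pixels, so the number of lesion pixels in the mask is unchanged by this sketch.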
The segmentation accuracy is evaluated using intersection over union (IoU) and the Dice coefficient, reaching IoU and Dice scores of 87.7% and 93.21% on ISIC 2016, 82.05% and 89.19% on ISIC 2017, 86.81% and 92.72% on ISIC 2018, and 92.79% and 96.21% on PH2 (code: https://github.com/hyjane/CCT-Net).
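The two overlap metrics behind the reported percentage pairs, intersection over union and the Dice coefficient, can be computed for a single binary mask as in the sketch below. The function name `iou_and_dice` and the smoothing term `eps` are assumptions for illustration; the abstract does not say how the authors threshold predictions or average scores over a dataset.

```python
import numpy as np

def iou_and_dice(pred, target, eps=1e-7):
    """Return (IoU, Dice) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    # IoU = |A ∩ B| / |A ∪ B|
    iou = (intersection + eps) / (union + eps)
    # Dice = 2 |A ∩ B| / (|A| + |B|)
    dice = (2 * intersection + eps) / (pred.sum() + target.sum() + eps)
    return float(iou), float(dice)

pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
iou, dice = iou_and_dice(pred, target)  # IoU = 2/4, Dice = 4/6
```

For a single mask pair the two metrics are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Dataset-level averages need not satisfy this identity exactly.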
