Abstract: Automatic segmentation and classification of lesions are two clinically significant tasks in the computer-aided diagnosis of skin diseases. Both tasks are challenging due to the non-negligible lesion differences in dermoscopic images from different patients. In this paper, we propose a novel pipeline to efficiently perform skin lesion segmentation and classification, consisting of a segmentation network and a classification network. To improve the performance of the segmentation network, we propose a novel Multi-Scale Holistic Feature Exploration (MSH) module to thoroughly exploit perceptual clues latent among the multi-scale feature maps synthesized by the decoder. The MSH module enables holistic exploration of features across multiple scales to more effectively support downstream image analysis tasks. To boost the performance of the classification network, we propose a novel Cross-Modality Collaborative Feature Exploration (CMC) module to discover latent discriminative features by collaboratively exploiting potential relationships between cross-modal features of dermoscopic images and clinical metadata. The CMC module dynamically captures versatile interaction effects among cross-modal features during the model's representation learning by discriminatively and adaptively learning the interaction weight associated with each cross-modality feature pair. In addition, to effectively reduce background noise and boost the lesion discrimination ability of the classification network, we crop the images based on lesion masks generated by the best segmentation model. We evaluate the proposed pipeline on four public skin lesion datasets, where ISIC 2018 and PH2 are used for segmentation, and ISIC 2019 and ISIC 2020 are combined into a new dataset, ISIC 2019&2020, for classification. It achieves Jaccard indices of 83.31% and 90.14% in skin lesion segmentation, and an AUC of 97.98% and an accuracy of 92.63% in skin lesion classification, which is superior to the performance of representative state-of-the-art skin lesion segmentation and classification methods. Last but not least, the new segmentation model uses far fewer parameters (3.3 M) than its peer approaches, which greatly reduces the number of labeled samples required for model training and makes it substantially more robust than its peers.
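
As a rough illustration of two steps named in the abstract, the sketch below computes a Jaccard index between two binary masks and crops an image to the bounding box of a lesion mask. It is a minimal NumPy sketch, not the authors' implementation: the function names `jaccard_index` and `crop_to_mask`, the `margin` parameter, and the handling of empty masks are all assumptions made for this example, and the abstract does not say how the crop region is actually derived from the mask.

```python
import numpy as np


def jaccard_index(pred, target):
    """Jaccard index (intersection over union) of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)


def crop_to_mask(image, mask, margin=0):
    """Crop `image` to the bounding box of the nonzero pixels of `mask`.

    `margin` (hypothetical parameter) pads the box by that many pixels
    on each side, clipped to the image borders.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        # Assumption: keep the full image when no lesion pixel is predicted.
        return image
    rows = np.flatnonzero(mask.any(axis=1))  # row indices containing lesion
    cols = np.flatnonzero(mask.any(axis=0))  # column indices containing lesion
    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + 1 + margin, mask.shape[0])
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + 1 + margin, mask.shape[1])
    return image[top:bottom, left:right]
```

In a pipeline of the kind described, a function like `crop_to_mask` would be applied to each dermoscopic image using the mask predicted by the segmentation model before the image is passed to the classifier, and `jaccard_index` would be averaged over a test set to score segmentation.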

Learning from Dermoscopic Images in Association with Clinical Metadata for Skin Lesion Segmentation and Classification