Abstract:

OBJECTIVE: Transformers, introduced to remedy the limited receptive fields of CNNs, have recently attracted intense attention. However, the high computational complexity of global representation learning, together with rigid window partitioning, hinders their deployment in medical image segmentation. This work aims to address these two issues in transformers for better medical image segmentation.

METHODS: We propose a boundary-aware lightweight transformer (BATFormer) that builds cross-scale global interaction with lower computational complexity and generates windows flexibly under the guidance of entropy. Specifically, to fully exploit the benefits of transformers in establishing long-range dependencies, a cross-scale global transformer (CGT) module is introduced to jointly utilize multiple small-scale feature maps for richer global features with lower computational complexity. Given the importance of shape modeling in medical image segmentation, a boundary-aware local transformer (BLT) module is constructed. Unlike the rigid window partitioning in vanilla transformers, which produces boundary distortion, BLT adopts an adaptive window partitioning scheme under the guidance of entropy for both computational complexity reduction and shape preservation.

RESULTS: BATFormer achieves the best Dice scores of 92.84%, 91.97%, 90.26%, and 96.30% for the average, right ventricle, myocardium, and left ventricle, respectively, on the ACDC dataset, and the best Dice, IoU, and ACC of 90.76%, 84.64%, and 96.76%, respectively, on the ISIC 2018 dataset. More importantly, BATFormer requires the fewest model parameters and the lowest computational complexity among the state-of-the-art approaches compared.

CONCLUSION AND SIGNIFICANCE: Our results demonstrate the necessity of developing customized transformers for efficient and better medical image segmentation. We believe the design of BATFormer is inspiring and extendable to other applications/frameworks. The source code is publicly available at https://github.com/xianlin7/BATFormer.
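
The entropy-guided window idea described under METHODS can be illustrated with a small Python sketch. It is not taken from the BATFormer source code and should not be read as the authors' BLT implementation. It only assumes that a coarse foreground-probability map is available, scores candidate windows by their mean Shannon entropy, and keeps the most uncertain ones, which tend to lie on object boundaries. The function names `binary_entropy` and `select_windows`, the parameters `size`, `stride`, and `k`, and the toy probability map are all hypothetical choices made for illustration.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a Bernoulli variable with P(foreground) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully confident prediction carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_windows(prob_map, size, stride, k):
    """Return the (top, left) corners of the k candidate windows with the
    highest mean entropy. Hypothetical helper, not the authors' API."""
    ent = [[binary_entropy(p) for p in row] for row in prob_map]
    height, width = len(ent), len(ent[0])
    scored = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            total = sum(
                ent[r][c]
                for r in range(top, top + size)
                for c in range(left, left + size)
            )
            scored.append((total / (size * size), top, left))
    # Highest mean entropy first; ties keep their scan order (stable sort).
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(top, left) for _, top, left in scored[:k]]


# Toy 6x6 map (an assumption): confident background on the left, confident
# foreground on the right, and an uncertain band in the middle columns.
prob_map = [[0.01, 0.01, 0.5, 0.5, 0.99, 0.99] for _ in range(6)]
windows = select_windows(prob_map, size=2, stride=2, k=3)
print(windows)
```

In this toy example the selected windows all fall on the uncertain middle band, so local attention would be spent where the boundary is likely to be rather than on a fixed grid. How BATFormer actually derives its entropy map and places its windows is specified in the paper and the linked repository, not here.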

DTMFormer: Dynamic Token Merging for Boosting Transformer-Based Medical Image Segmentation

MixFormer: a Mixed CNN-Transformer Backbone for Medical Image Segmentation

Mixed Transformer U-Net for Medical Image Segmentation

mmFormer: Multimodal Medical Transformer for Incomplete Multimodal Learning of Brain Tumor Segmentation

ConvFormer: Plug-and-Play CNN-Style Transformers for Improving Medical Image Segmentation

HD-Former: A hierarchical dependency Transformer for medical image segmentation

DAE-Former: Dual Attention-guided Efficient Transformer for Medical Image Segmentation

BATFormer: Towards Boundary-Aware Lightweight Transformer for Efficient Medical Image Segmentation

STA-Former: enhancing medical image segmentation with Shrinkage Triplet Attention in a hybrid CNN-Transformer model

TMFormer: Token Merging Transformer for Brain Tumor Segmentation with Missing Modalities

H2Former: An Efficient Hierarchical Hybrid Transformer for Medical Image Segmentation

ClassFormer: Exploring Class-Aware Dependency with Transformer for Medical Image Segmentation

TEC-Net: Vision Transformer Embrace Convolutional Neural Networks for Medical Image Segmentation

D-Former: A U-shaped Dilated Transformer for 3D Medical Image Segmentation

HTC-Net: A hybrid CNN-transformer framework for medical image segmentation

Localization of hemoptysis in patients with cystic fibrosis.

SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation

TransDAE: Dual Attention Mechanism in a Hierarchical Transformer for Efficient Medical Image Segmentation

Multi-dimension Transformer with Attention-based Filtering for Medical Image Segmentation

MCRformer: Morphological constraint reticular transformer for 3D medical image segmentation