Abstract:
OBJECTIVE: Transformers, born to remedy the inadequate receptive fields of CNNs, have drawn explosive attention recently. However, the daunting computational complexity of global representation learning, together with rigid window partitioning, hinders their deployment in medical image segmentation. This work aims to address these two issues in transformers for better medical image segmentation.
METHODS: We propose a boundary-aware lightweight transformer (BATFormer) that can build cross-scale global interaction with lower computational complexity and generate windows flexibly under the guidance of entropy. Specifically, to fully explore the benefits of transformers in long-range dependency establishment, a cross-scale global transformer (CGT) module is introduced to jointly utilize multiple small-scale feature maps for richer global features with lower computational complexity. Given the importance of shape modeling in medical image segmentation, a boundary-aware local transformer (BLT) module is constructed. Unlike the rigid window partitioning of vanilla transformers, which produces boundary distortion, BLT adopts an adaptive window partitioning scheme under the guidance of entropy for both computational complexity reduction and shape preservation.
RESULTS: BATFormer achieves the best performance in Dice of 92.84%, 91.97%, 90.26%, and 96.30% for the average, right ventricle, myocardium, and left ventricle respectively on the ACDC dataset, and the best performance in Dice, IoU, and ACC of 90.76%, 84.64%, and 96.76% respectively on the ISIC 2018 dataset. More importantly, BATFormer requires the fewest model parameters and the lowest computational complexity compared to the state-of-the-art approaches.
CONCLUSION AND SIGNIFICANCE: Our results demonstrate the necessity of developing customized transformers for efficient and better medical image segmentation. We believe the design of BATFormer is inspiring and extendable to other applications/frameworks. The source code is publicly available at https://github.com/xianlin7/BATFormer.
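For readers unfamiliar with the three metrics reported above, the following is a minimal sketch of how Dice, IoU, and pixel accuracy (ACC) are commonly computed for a pair of binary masks. It is not taken from the BATFormer repository: the function names, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, and the abstract does not state how scores are averaged across classes or cases.

```python
from typing import Sequence, Tuple

# Assumed representation: a 2D binary mask as nested sequences, 1 = foreground.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, target: Mask) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) pixel counts for two equally sized binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def dice(pred: Mask, target: Mask) -> float:
    """Dice = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom


def iou(pred: Mask, target: Mask) -> float:
    """IoU = TP / (TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = tp + fp + fn
    return 1.0 if denom == 0 else tp / denom


def accuracy(pred: Mask, target: Mask) -> float:
    """ACC = (TP + TN) / total number of pixels."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    total = tp + fp + fn + tn
    return 1.0 if total == 0 else (tp + tn) / total


# Toy 3x3 example (hypothetical data, not from ACDC or ISIC 2018).
pred = [[1, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
target = [[1, 1, 0],
          [0, 0, 0],
          [0, 1, 0]]

print(f"Dice={dice(pred, target):.4f} "
      f"IoU={iou(pred, target):.4f} "
      f"ACC={accuracy(pred, target):.4f}")
```

In the toy example there are 2 true positives, 1 false positive, 1 false negative, and 5 true negatives, so Dice is 4/6, IoU is 2/4, and ACC is 7/9. Dice is never lower than IoU for the same pair of masks, which is consistent with the ISIC 2018 figures quoted above.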

ConvFormer: Plug-and-Play CNN-Style Transformers for Improving Medical Image Segmentation

MixFormer: a Mixed CNN-Transformer Backbone for Medical Image Segmentation

ScaleFormer: Revisiting the Transformer-based Backbones from a Scale-wise Perspective for Medical Image Segmentation

MAXFormer: Enhanced Transformer for Medical Image Segmentation with Multi-Attention and Multi-Scale Features Fusion

ConvFormer: Combining CNN and Transformer for Medical Image Segmentation

MISSFormer: An Effective Medical Image Segmentation Transformer

CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation

DTMFormer: Dynamic Token Merging for Boosting Transformer-Based Medical Image Segmentation

nnFormer: Volumetric Medical Image Segmentation via a 3D Transformer

STA-Former: enhancing medical image segmentation with Shrinkage Triplet Attention in a hybrid CNN-Transformer model

H2Former: An Efficient Hierarchical Hybrid Transformer for Medical Image Segmentation

TEC-Net: Vision Transformer Embrace Convolutional Neural Networks for Medical Image Segmentation

D-Former: A U-shaped Dilated Transformer for 3D Medical Image Segmentation

BATFormer: Towards Boundary-Aware Lightweight Transformer for Efficient Medical Image Segmentation

CTC-Net: A Novel Coupled Feature-Enhanced Transformer and Inverted Convolution Network for Medical Image Segmentation

MultiTrans: Multi-branch transformer network for medical image segmentation

3D Medical image segmentation using parallel transformers

MISSFormer: an Effective Transformer for 2D Medical Image Segmentation

HD-Former: A hierarchical dependency Transformer for medical image segmentation

UCTNet: Uncertainty-guided CNN-Transformer hybrid networks for medical image segmentation