Abstract: Gastrointestinal (GI) cancer is a malignancy affecting the digestive organs. During radiation therapy, the radiation oncologist must precisely aim the X-ray beam at the tumor while avoiding unaffected areas of the stomach and intestines. Consequently, accurate, automated GI image segmentation is urgently needed in clinical practice. While the fully convolutional network (FCN) and U-Net framework have shown impressive results in medical image segmentation, their ability to model long-range dependencies is constrained by the restricted receptive field of the convolutional kernel. The transformer has a robust capacity for global modeling owing to its inherent global self-attention mechanism. The TransUnet model leverages the strengths of both the convolutional neural network (CNN) and the transformer through a hybrid CNN-transformer encoder. However, its decoder simply concatenates high- and low-level features, which does not fuse global and local information effectively. To overcome this limitation, we propose a transformer-based medical image segmentation architecture called BiFTransNet, which introduces a BiFusion module into the decoder stage to fuse global and local features by integrating features from different modules. Further, a multilevel loss (ML) strategy is introduced to supervise the learning process of each decoder layer and to make better use of the fused global and local contextual features at different scales. Our method achieved a Dice score of 89.51% and an intersection-over-union (IoU) score of 86.54% on the UW-Madison Gastrointestinal Segmentation dataset. Moreover, our method attained a Dice score of 78.77% and a Hausdorff distance (HD) of 27.94% on the Synapse Multi-organ Segmentation dataset. Compared with state-of-the-art methods, our proposed method achieves superior segmentation performance in gastrointestinal segmentation tasks. More significantly, our method can be easily extended to medical segmentation in different modalities such as CT and MRI. It supports multimodal clinical segmentation and provides decision support for clinical radiotherapy planning.
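For readers unfamiliar with the two overlap metrics reported above, the following is a minimal sketch of how Dice and IoU are commonly computed for a single binary mask. It is not the authors' evaluation code: the function names, the nested-list mask representation, and the smoothing constant `eps` are assumptions made for illustration only.

```python
from typing import Sequence, Tuple

# Assumed representation: a 2D binary mask, 1 = organ, 0 = background.
Mask = Sequence[Sequence[int]]


def overlap_counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted foreground, target foreground) pixel counts."""
    inter = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            target_sum += 1 if t else 0
    return inter, pred_sum, target_sum


def dice_score(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|); eps (an assumption) avoids division by zero."""
    inter, pred_sum, target_sum = overlap_counts(pred, target)
    return (2 * inter + eps) / (pred_sum + target_sum + eps)


def iou_score(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """IoU = |A ∩ B| / |A ∪ B|, with |A ∪ B| = |A| + |B| - |A ∩ B|."""
    inter, pred_sum, target_sum = overlap_counts(pred, target)
    union = pred_sum + target_sum - inter
    return (inter + eps) / (union + eps)
```

As a small worked case, with `pred = [[1, 1, 0], [0, 1, 0]]` and `target = [[1, 0, 0], [0, 1, 1]]` the masks share 2 foreground pixels out of 3 each, so Dice is about 0.667 and IoU is about 0.5. Dice is never lower than IoU for the same pair of masks, which is consistent with the Dice score exceeding the IoU score in the results above.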

FocalUNETR: A Focal Transformer for Boundary-aware Segmentation of CT Images

CCT-Unet: A U-Shaped Network Based on Convolution Coupled Transformer for Segmentation of Peripheral and Transition Zones in Prostate MRI

FCTformer: Fusing Convolutional Operations and Transformer for 3D Rectal Tumor Segmentation in MR Images

BiFTransNet: A unified and simultaneous segmentation network for gastrointestinal images of CT & MRI

CT Male Pelvic Organ Segmentation Using Fully Convolutional Networks with Boundary Sensitive Representation

TripletUNet: Multi-Task U-Net with Online Voxel-Wise Learning for Precise CT Prostate Segmentation

Automatic Segmentation of the Prostate on CT Images Using Deep Learning and Multi-Atlas Fusion

HTC-Net: A hybrid CNN-transformer framework for medical image segmentation

CT-Net: Asymmetric compound branch Transformer for medical image segmentation

CAFCT-Net: A CNN-Transformer Hybrid Network with Contextual and Attentional Feature Fusion for Liver Tumor Segmentation

TFCNs: A CNN-Transformer Hybrid Network for Medical Image Segmentation

Focal-UNet: UNet-like Focal Modulation for Medical Image Segmentation

CTC-Net: A Novel Coupled Feature-Enhanced Transformer and Inverted Convolution Network for Medical Image Segmentation

OdCFU-Net for Brain Tumor Segmentation Based on MRI

Asymmetric multi-task attention network for prostate bed segmentation in computed tomography images

Learning Distance Transform for Boundary Detection and Deformable Segmentation in CT Prostate Images

TSUBF-Net: Trans-Spatial UNet-like Network with Bi-direction Fusion for Segmentation of Adenoid Hypertrophy in CT

CFANet: Context fusing attentional network for preoperative CT image segmentation in robotic surgery

Hybrid-ctunet: a double complementation approach for 3D medical image segmentation

Focus-TransUnet3D: High-precision Model for 3D Segmentation of Medical Point Targets

FcTC-UNet: Fine-grained Combination of Transformer and CNN for Thoracic Organs Segmentation