Abstract: Gastrointestinal (GI) cancer is a malignancy affecting the digestive organs. During radiation therapy, the radiation oncologist must aim the X-ray beam precisely at the tumor while avoiding unaffected areas of the stomach and intestines. Consequently, accurate, automated GI image segmentation is urgently needed in clinical practice. Although the fully convolutional network (FCN) and U-Net frameworks have shown impressive results in medical image segmentation, their ability to model long-range dependencies is constrained by the restricted receptive field of the convolutional kernel. The transformer, by contrast, has a strong capacity for global modeling owing to its inherent global self-attention mechanism. The TransUNet model leverages the strengths of both the convolutional neural network (CNN) and the transformer through a hybrid CNN-transformer encoder. However, simply concatenating high- and low-level features in its decoder does not fuse global and local information effectively. To overcome this limitation, we propose a transformer-based medical image segmentation architecture called BiFTransNet, which introduces a BiFusion module into the decoder stage to fuse global and local features by integrating features from different modules. Further, a multilevel loss (ML) strategy is introduced to supervise the learning process of each decoder layer and to make better use of the globally and locally fused contextual features at different scales. Our method achieved a Dice score of 89.51% and an intersection-over-union (IoU) score of 86.54% on the UW-Madison Gastrointestinal Segmentation dataset. Moreover, our method attained a Dice score of 78.77% and a Hausdorff distance (HD) of 27.94% on the Synapse Multi-organ Segmentation dataset. Compared with state-of-the-art methods, our proposed method achieves superior segmentation performance in gastrointestinal segmentation tasks. More significantly, our method can be easily extended to medical segmentation in different modalities such as CT and MRI. It thus supports multimodal medical segmentation in clinical settings and provides decision support for clinical radiotherapy planning.
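For readers unfamiliar with the two overlap metrics reported above, the following is a minimal sketch of how a Dice score and an IoU score can be computed for a single pair of binary masks. It is an illustration only, not the evaluation code used in this work: the function name `dice_and_iou`, the smoothing constant `eps`, and the toy masks are all assumptions made for the example.

```python
from itertools import chain


def _flatten(mask):
    """Flatten a 2D mask (a list of rows) into a flat list of booleans."""
    return [bool(v) for v in chain.from_iterable(mask)]


def dice_and_iou(pred, target, eps=1e-7):
    """Return (dice, iou) for two binary masks of identical shape.

    `eps` is a hypothetical smoothing constant that avoids division by
    zero when both masks are empty; it is not taken from the paper.
    """
    p = _flatten(pred)
    t = _flatten(target)
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels labeled foreground in both masks.
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(p)
    target_area = sum(t)
    union = pred_area + target_area - intersection

    dice = (2 * intersection + eps) / (pred_area + target_area + eps)
    iou = (intersection + eps) / (union + eps)
    return dice, iou


# Toy 2x3 masks: 3 predicted pixels, 3 ground-truth pixels, 2 shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
```

For the toy masks, the overlap is 2 pixels against 3 predicted and 3 ground-truth pixels, so Dice is about 2/3 and IoU is about 1/2. Dice is never lower than IoU for the same pair of masks, which is consistent with the ordering of the two scores quoted in the abstract.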

TransCUNet: UNet cross fused transformer for medical image segmentation

FCTrans UNet: A Hybrid CNN and Transformer Model for Medical Image Segmentations

E-Transunet: Enhanced Transunet for Medical Image Segmentation

Multiscale Transunet++: Dense Hybrid U-Net with Transformer for Medical Image Segmentation

P-TransUNet: an improved parallel network for medical image segmentation

TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

Medical Image Segmentation Using Dual Branch Networks with Embedded Attention Mechanism

EG-TransUNet: a transformer-based U-Net with enhanced and guided models for biomedical image segmentation

MTC-TransUNet: A Multi-Scale Mixed Convolution TransUNet for Medical Image Segmentation

BiFTransNet: A unified and simultaneous segmentation network for gastrointestinal images of CT & MRI

TransUNet: Rethinking the U-Net architecture design for medical image segmentation through the lens of transformers

3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers

HCT-Unet: multi-target medical image segmentation via a hybrid CNN-transformer Unet incorporating multi-axis gated multi-layer perceptron

MSCT-UNET: multi-scale contrastive transformer within U-shaped network for medical image segmentation

Isc-Transunet: Medical Image Segmentation Network Based On The Integration Of Self-Attention And Convolution

A novel full-convolution UNet-transformer for medical image segmentation

Sfe-Transunet: A Transformer-Based U-Net With Skipped Features Enhancer For Medical Image Segmentation

CCT-Unet: A U-Shaped Network Based on Convolution Coupled Transformer for Segmentation of Peripheral and Transition Zones in Prostate MRI

GCtx-UNet: Efficient Network for Medical Image Segmentation

Cross Pyramid Transformer makes U-net stronger in medical image segmentation

Trans-UNeter: A new Decoder of TransUNet for Medical Image Segmentation