Abstract

Background: Precise glioma segmentation from multi-parametric magnetic resonance (MR) images is essential for brain glioma diagnosis. However, because the boundaries between tumor sub-regions are indistinct and gliomas appear heterogeneous in volumetric MR scans, designing a reliable automated glioma segmentation method remains challenging. Although existing 3D Transformer-based or convolution-based segmentation networks have obtained promising results via multi-modal feature fusion strategies or contextual learning methods, they generally lack hierarchical interactions between modalities and cannot effectively learn comprehensive feature representations for all glioma sub-regions.

Purpose: To overcome these problems, we propose a 3D hierarchical cross-modality interaction network (HCMINet) that uses Transformers and convolutions for accurate multi-modal glioma segmentation. It leverages a hierarchical cross-modality interaction strategy to learn modality-specific and modality-shared knowledge relevant to glioma sub-region segmentation from multi-parametric MR images.

Methods: In the HCMINet, we first design a hierarchical cross-modality interaction Transformer (HCMITrans) encoder that hierarchically encodes and fuses heterogeneous multi-modal features through Transformer-based intra-modal embeddings and inter-modal interactions in multiple encoding stages. This captures complex cross-modality correlations while modeling global contexts. We then combine the HCMITrans encoder with a modality-shared convolutional encoder to form a dual-encoder architecture in the encoding stage, which learns rich contextual information from both global and local perspectives. Finally, in the decoding stage, we present a progressive hybrid context fusion (PHCF) decoder that progressively fuses the local and global features extracted by the dual-encoder architecture. It uses a local-global context fusion (LGCF) module to alleviate the contextual discrepancy among the decoding features.

Results: Extensive experiments are conducted on two public glioma benchmark datasets: BraTS2020 (494 patients) and BraTS2021 (1251 patients). In our experiments, the proposed method outperforms existing Transformer-based and CNN-based methods that use other multi-modal fusion strategies. Specifically, HCMINet achieves state-of-the-art mean Dice similarity coefficient (DSC) values of 85.33% on the BraTS2020 online validation dataset and 91.09% on the BraTS2021 local testing dataset.

Conclusions: The proposed method can accurately and automatically segment glioma regions from multi-parametric MR images, which benefits the quantitative analysis of brain gliomas and helps reduce the annotation burden on neuroradiologists.
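The mean DSC values reported in the Results can be read against the standard Dice definition, DSC = 2|P ∩ T| / (|P| + |T|), for a predicted mask P and a ground-truth mask T. The following is a minimal sketch of that metric, not the authors' evaluation code. It assumes the usual BraTS label convention (1 = necrotic/non-enhancing core, 2 = edema, 4 = enhancing tumor) and assumes the mean is taken over the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) regions; neither detail is stated in the abstract. The names `REGIONS`, `dice`, and `mean_dice` are illustrative.

```python
from typing import Dict, Sequence, Set

# Assumed BraTS-style label convention (not stated in the abstract):
# 1 = necrotic/non-enhancing core, 2 = edema, 4 = enhancing tumor.
REGIONS: Dict[str, Set[int]] = {
    "WT": {1, 2, 4},  # whole tumor
    "TC": {1, 4},     # tumor core
    "ET": {4},        # enhancing tumor
}


def dice(pred: Sequence[int], truth: Sequence[int], labels: Set[int]) -> float:
    """Dice similarity coefficient for one region over flattened voxel labels."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of voxels")
    p = [v in labels for v in pred]
    t = [v in labels for v in truth]
    overlap = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(p) + sum(t)
    # Convention chosen for this sketch: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


def mean_dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Unweighted mean of the per-region Dice scores (assumed averaging scheme)."""
    scores = [dice(pred, truth, labels) for labels in REGIONS.values()]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy flattened volume of 8 voxels, purely illustrative.
    truth = [0, 1, 1, 2, 2, 4, 4, 0]
    pred = [0, 1, 2, 2, 2, 4, 0, 0]
    for name, labels in REGIONS.items():
        print(name, round(dice(pred, truth, labels), 4))
    print("mean", round(mean_dice(pred, truth), 4))
```

In practice the masks would be 3D volumes and the computation vectorized; flattened Python sequences are used here only to keep the sketch dependency-free.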

A 3D hierarchical cross-modality interaction network using transformers and convolutions for brain glioma segmentation in MR images