Abstract: Background: The information provided by different magnetic resonance imaging (MRI) modalities is complementary. Combining multiple modalities for brain tumor image segmentation can improve segmentation accuracy, which is of great significance for disease diagnosis and treatment. However, modality data are often missing to varying degrees in clinical practice, which may cause serious performance degradation, or even outright failure, in brain tumor segmentation methods that rely on full-modality sequences. To address this problem, this study aimed to design a new deep learning network for incomplete multimodal brain tumor segmentation.
Methods: We propose a novel cross-modal attention fusion-based deep neural network (CMAF-Net) for incomplete multimodal brain tumor segmentation. It is based on a three-dimensional (3D) U-Net architecture with an encoding and decoding structure, a 3D Swin block, and a cross-modal attention fusion (CMAF) block. A convolutional encoder first extracts modality-specific features, and a 3D Swin block models long-range dependencies to obtain richer information for brain tumor segmentation. A cross-attention-based CMAF module then handles different missing-modality situations by fusing features across modalities to learn shared representations of the tumor regions. Finally, the fused latent representation is decoded to obtain the final segmentation result. Additionally, a channel attention module (CAM) and a spatial attention module (SAM) are incorporated into the network to further improve the robustness of the model: the CAM helps the network focus on important feature channels, and the SAM learns the importance of different spatial regions.
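
The idea of fusing only the modalities that are actually present can be illustrated with a small sketch. This is not the CMAF block itself: it assumes one flat feature vector per modality, uses the features directly as queries, keys, and values (no learned projections), and averages the attended outputs. The function names `softmax` and `fuse_available` are hypothetical and chosen for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_available(features, available):
    """Fuse per-modality feature vectors, skipping missing modalities.

    features:  list of equal-length float vectors, one per modality
    available: list of bools, True where the modality was acquired

    Assumption: features act as queries, keys, and values at once;
    a real network would use learned projections on 3D feature maps.
    """
    idx = [i for i, ok in enumerate(available) if ok]
    if not idx:
        raise ValueError("at least one modality must be available")

    dim = len(features[idx[0]])
    scale = math.sqrt(dim)
    fused = [0.0] * dim

    for q in idx:
        # Scaled dot-product scores against available modalities only,
        # so a missing modality never receives attention weight.
        scores = [
            sum(a * b for a, b in zip(features[q], features[k])) / scale
            for k in idx
        ]
        weights = softmax(scores)
        for w, k in zip(weights, idx):
            for d in range(dim):
                # Average the attended outputs over all queries.
                fused[d] += w * features[k][d] / len(idx)
    return fused
```

Under these assumptions the output has the same size regardless of how many modalities are missing, which is what allows a single decoder to be reused across missing-modality situations. With only one modality available, the sketch simply returns that modality's features.
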
Results: Evaluation experiments on the widely used BraTS 2018 and BraTS 2020 datasets demonstrated the effectiveness of the proposed CMAF-Net. On the BraTS 2020 dataset, it achieved average Dice scores of 87.9%, 81.8%, and 64.3% and Hausdorff distances of 4.21, 5.35, and 4.02 for whole tumor, tumor core, and enhancing tumor, respectively, outperforming several state-of-the-art segmentation methods in missing-modality situations.
Conclusions: The experimental results show that the proposed CMAF-Net can achieve accurate brain tumor segmentation when modalities are missing and has promising application potential.
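
For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of how it is computed for one binary region (for example, whole tumor). It assumes flattened binary masks given as lists of 0/1 values; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for flat binary masks.

    pred, target: equal-length sequences of 0/1 voxel labels.
    Assumption: two empty masks count as perfect agreement (1.0).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    # Count voxels labelled positive in both masks.
    overlap = sum(1 for p, t in zip(pred, target) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in target if t)

    if size_sum == 0:
        return 1.0
    return 2.0 * overlap / size_sum
```

A score of 1.0 means the predicted and reference regions coincide exactly, and 0.0 means they do not overlap at all. The percentages in the abstract correspond to this quantity multiplied by 100 and averaged over the evaluated cases.
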

MSAFusionNet: Multiple Subspace Attention Based Deep Multi-modal Fusion Network

MSAIF-Net: A Multistage Spatial Attention-Based Invertible Fusion Network for MR Images

CASF-Net: Cross-attention and Cross-scale Fusion Network for Medical Image Segmentation

Sub-pixel multi-scale fusion network for medical image segmentation

MAFUNet: Multi-Attention Fusion Network for Medical Image Segmentation

A multibranch and multiscale neural network based on semantic perception for multimodal medical image fusion

MSA$^2$Net: Multi-scale Adaptive Attention-guided Network for Medical Image Segmentation

Medical Image Segmentation Based on Multi-Modal Convolutional Neural Network: Study on Image Fusion Schemes

A multi-attention and depthwise separable convolution network for medical image segmentation

MSAF: Multimodal Split Attention Fusion

Adaptive spatial and frequency experts fusion network for medical image fusion

Multi-Stage Fusion and Multi-Source Attention Network for Multi-Modal Remote Sensing Image Segmentation

MFA-Net: Multiple Feature Association Network for Medical Image Segmentation

Learning Cross-Modal Deep Representations for Multi-Modal MR Image Segmentation

AFFSegNet: Adaptive Feature Fusion Segmentation Network for Microtumors and Multi-Organ Segmentation

Max-Fusion U-Net for Multi-Modal Pathology Segmentation with Attention and Dynamic Resampling

CMAF-Net: a cross-modal attention fusion-based deep neural network for incomplete multi-modal brain tumor segmentation

Deep Learning-Based Image Segmentation on Multimodal Medical Imaging

Multi-scale Feature Pyramid Fusion Network for Medical Image Segmentation

MFH‐Net: A Hybrid CNN‐Transformer Network Based Multi‐Scale Fusion for Medical Image Segmentation

MASDF-Net: A Multi-Attention Codec Network with Selective and Dynamic Fusion for Skin Lesion Segmentation