EM-Net: Efficient Channel and Frequency Learning with Mamba for 3D Medical Image Segmentation

Ao Chang, Jiajun Zeng, Ruobing Huang, Dong Ni

2024-09-26

Abstract: Convolutional neural networks have primarily led 3D medical image segmentation but may be limited by small receptive fields. Transformer models excel in capturing global relationships through self-attention but are challenged by high computational costs at high resolutions. Recently, Mamba, a state space model, has emerged as an effective approach for sequential modeling. Inspired by its success, we introduce a novel Mamba-based 3D medical image segmentation model called EM-Net. It not only efficiently captures attentive interaction between regions by integrating and selecting channels, but also effectively utilizes the frequency domain to harmonize the learning of features across varying scales, while accelerating training speed. Comprehensive comparisons with other state-of-the-art (SOTA) algorithms on two challenging multi-organ datasets show that our method achieves better segmentation accuracy while requiring nearly half the parameter size of SOTA models and training 2x faster.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper targets four limitations of current approaches to 3D medical image segmentation:

1. **Limited Receptive Field**: Convolutional Neural Networks (CNNs) perform well in 3D medical image segmentation, but their small receptive field makes it difficult to capture global information.
2. **High Computational Cost**: Transformer models capture global relationships through self-attention, but the computational cost of self-attention becomes prohibitive on high-resolution images.
3. **Spatial Relationship Modeling**: Sequence models struggle to model spatial relationships in 3D images. They often depend on strong positional encoding, which captures complex spatial dependencies only to a limited extent.
4. **Memory Consumption**: Medical image segmentation often requires substantial memory, which is a problem when hardware resources are limited.

To address these issues, the authors propose EM-Net, a novel Mamba-based 3D medical image segmentation model. Its main contributions are:

1. **Channel Squeeze-Reinforce Mamba (CSRM) Module**: Captures relevant patterns in target regions through channel selection and adaptive calibration.
2. **Efficient Frequency-Domain Learning (EFL) Layer**: Uses the Fast Fourier Transform (FFT) to apply learnable frequency weighting, balancing the learning of features at different scales.
3. **Mamba-Enhanced Decoder**: Further improves segmentation performance while reducing memory consumption.

Experiments on two challenging multi-organ datasets show that EM-Net achieves higher segmentation accuracy than existing state-of-the-art models while using nearly half as many parameters and training about twice as fast.
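The idea of learnable frequency weighting through the FFT can be illustrated with a minimal NumPy sketch. This is not the authors' EFL layer: the function name `frequency_weighting`, the tensor layout `(C, D, H, W)`, and the complex `weight` array (a stand-in for a learnable parameter) are all assumptions made for illustration, and a real layer would live in a deep learning framework with gradients.

```python
import numpy as np


def frequency_weighting(x, weight):
    """Reweight a 3D feature volume element-wise in the frequency domain.

    x:      real array of shape (C, D, H, W)
    weight: complex array of shape (C, D, H, W // 2 + 1); a hypothetical
            stand-in for a learnable parameter, not taken from the paper
    """
    spatial_shape = x.shape[-3:]
    # Real-input FFT over the three spatial axes; the last axis keeps
    # only W // 2 + 1 frequencies because the input is real.
    spectrum = np.fft.rfftn(x, axes=(-3, -2, -1))
    # Scale every frequency component by its own weight.
    spectrum = spectrum * weight
    # Transform back to the spatial domain at the original size.
    return np.fft.irfftn(spectrum, s=spatial_shape, axes=(-3, -2, -1))


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 8, 8))

# All-ones weights leave the volume unchanged.
identity = np.ones((2, 8, 8, 5), dtype=np.complex128)
y = frequency_weighting(x, identity)

# Keeping only the zero-frequency term reduces each channel to its mean.
dc_only = np.zeros_like(identity)
dc_only[:, 0, 0, 0] = 1.0
z = frequency_weighting(x, dc_only)
```

Low frequencies carry coarse, large-scale structure and high frequencies carry fine detail, so a single element-wise weight over the spectrum lets a model rebalance features across scales with a global operation whose cost is dominated by the FFT.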

MambaClinix: Hierarchical Gated Convolution and Mamba-Based U-Net for Enhanced 3D Medical Image Segmentation

Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation

U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation

Taming Mambas for Voxel Level 3D Medical Image Segmentation

SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation

LPAM: A lightweight medical segmentation network based on Mamba improved by prompt attention

nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model

Large Window-based Mamba UNet for Medical Image Segmentation: Beyond Convolution and Self-attention

T-Mamba: Frequency-Enhanced Gated Long-Range Dependency for Tooth 3D CBCT Segmentation

Tri-Plane Mamba: Efficiently Adapting Segment Anything Model for 3D Medical Images

MedSegMamba: 3D CNN-Mamba Hybrid Architecture for Brain Segmentation

HMT-UNet: A hybird Mamba-Transformer Vision UNet for Medical Image Segmentation

VM-UNET-V2 Rethinking Vision Mamba UNet for Medical Image Segmentation

Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining

Dual triple attention guided CNN-VMamba for medical image segmentation

Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image Segmentation

LKM-UNet: Large Kernel Vision Mamba UNet for Medical Image Segmentation