VM-UNet: Vision Mamba UNet for Medical Image Segmentation

Jiacheng Ruan, Suncheng Xiang
2024-02-04
Abstract: In the realm of medical image segmentation, both CNN-based and Transformer-based models have been extensively explored. However, CNNs exhibit limitations in long-range modeling capabilities, whereas Transformers are hampered by their quadratic computational complexity. Recently, State Space Models (SSMs), exemplified by Mamba, have emerged as a promising approach. They not only excel in modeling long-range interactions but also maintain linear computational complexity. In this paper, leveraging state space models, we propose a U-shaped architecture for medical image segmentation, named Vision Mamba UNet (VM-UNet). Specifically, the Visual State Space (VSS) block is introduced as the foundation block to capture extensive contextual information, and an asymmetrical encoder-decoder structure is constructed. We conduct comprehensive experiments on the ISIC17, ISIC18, and Synapse datasets, and the results indicate that VM-UNet performs competitively in medical image segmentation tasks. To the best of our knowledge, this is the first medical image segmentation model built purely on SSMs. We aim to establish a baseline and provide valuable insights for the future development of more efficient and effective SSM-based segmentation systems. Our code is available at
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses two limitations of existing medical image segmentation models: Convolutional Neural Networks (CNNs) are weak at modeling long-range dependencies, and Transformers incur quadratic computational complexity. It proposes VM-UNet, a U-shaped architecture built purely on State Space Models (SSMs), which captures long-range dependencies while keeping computational complexity linear. In experiments on the ISIC17, ISIC18, and Synapse datasets, VM-UNet performs competitively with existing methods, establishing a baseline for more efficient and effective SSM-based segmentation models.
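To make the linear-complexity claim concrete, the following is a minimal sketch of a generic discrete state space recurrence, h_t = A·h_{t-1} + B·x_t and y_t = C·h_t, applied to a 1-D sequence. It is not the VSS block or Mamba's selective scan. The function name `ssm_scan`, the diagonal state matrix, and the toy parameter values are all assumptions chosen for illustration.

```python
def ssm_scan(a_diag, b, c, xs):
    """Run a toy discrete linear state space model over a sequence.

    Assumes a diagonal state matrix (entries in a_diag), so each step is:
        h_t = a_diag * h_{t-1} + b * x_t   (elementwise)
        y_t = sum(c * h_t)
    """
    h = [0.0] * len(a_diag)  # hidden state starts at zero
    ys = []
    for x_t in xs:  # one fixed-cost update per input element
        h = [a * h_i + b_i * x_t for a, h_i, b_i in zip(a_diag, h, b)]
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return ys


# Hypothetical toy parameters: a two-dimensional state with decay rates 0.5 and 0.25.
impulse_response = ssm_scan(
    a_diag=[0.5, 0.25],
    b=[1.0, 1.0],
    c=[1.0, 1.0],
    xs=[1.0, 0.0, 0.0, 0.0],  # unit impulse at the first position
)
print(impulse_response)
```

Each input element triggers one fixed-size state update, so the cost grows linearly with sequence length, whereas self-attention compares every pair of positions. The hidden state still carries information from the first input to later outputs, which is the sense in which such models can represent long-range interactions.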