Dual triple attention guided CNN-VMamba for medical image segmentation

Li, Jing
DOI: https://doi.org/10.1007/s00530-024-01498-3
IF: 3.9
2024-09-21
Multimedia Systems
Abstract: Medical image segmentation plays a vital role in helping doctors quickly and accurately identify pathological areas in medical images, which is essential for effective diagnosis and treatment planning. Although current architectures that integrate CNNs and Transformers have achieved impressive results, they remain constrained by the limitations of their fusion methods and by the computational complexity of Transformers. To address these limitations, we introduce a dual triple attention module designed to encourage selective modeling of image features, thereby enhancing overall segmentation performance. We design the feature extraction module on CNN and VMamba to extract both local and global features. The CNN part includes regular and dilated convolutions, and we use the Visual State Space Block from VMamba instead of Transformers to reduce computational complexity. The entire network is connected through new skip connections. Experimental results show that the proposed model achieves DSC scores of 92.34% and 82.16% on the ACDC and Synapse datasets, respectively. The model also requires fewer FLOPs and parameters than other state-of-the-art methods.
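For readers unfamiliar with the DSC metric quoted in the abstract, the following is a minimal sketch of the standard Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|), for two binary masks. It is not the authors' evaluation code: the function name `dice_coefficient`, the flattened-list input, and the convention that two empty masks score 1.0 are assumptions made for illustration. How the paper aggregates scores across classes and cases is not stated in the abstract.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice similarity coefficient for two flattened binary masks.

    Hypothetical helper for illustration; inputs are sequences of 0/1 labels.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    # |A ∩ B|: pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # |A| + |B|: foreground pixel counts of each mask, summed.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: 2 overlapping foreground pixels, 3 foreground pixels per mask.
pred_mask = [0, 1, 1, 1, 0, 0]
true_mask = [0, 1, 1, 0, 1, 0]
print(f"DSC = {dice_coefficient(pred_mask, true_mask):.4f}")
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.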
computer science, information systems, theory & methods