MSNet: a novel network with comprehensive multi-scale feature integration for gastric cancer and colon polyp segmentation

Li, Yunqi
DOI: https://doi.org/10.1007/s11760-024-03594-3
IF: 1.583
2024-12-12
Signal Image and Video Processing
Abstract: The precise and reliable segmentation of endoscopic images plays a pivotal role in the early diagnosis of gastrointestinal cancer lesions such as gastric cancer and colon polyps. Because these lesions vary in appearance and have blurred boundaries, precise automatic segmentation remains challenging. This paper therefore proposes MSNet, a novel CNN-Transformer hybrid network that comprehensively utilizes multi-scale features. It aims to improve the segmentation accuracy of gastrointestinal cancer lesions by enhancing the ability to capture multi-scale features and to distinguish boundaries. Firstly, a multi-scale perception module (MSPM) is designed to capture and enhance multi-scale features, employing multiple pathways with stripe convolutions of various kernel sizes. Secondly, to enhance the model's ability to identify target boundaries, an improved dual feature pyramid (IDFP) strategy is proposed to obtain multi-scale prediction maps from various pathways and stages, thereby aggregating multi-scale features. Thirdly, to refine the network's recognition of target boundaries, a boundary enhancement module (BEM) is devised, which integrates the multi-scale prediction maps obtained from IDFP with the final output of the decoder. This integration is intended to mitigate the spatial information loss caused by consecutive downsampling and upsampling operations, thereby achieving more precise segmentation. Extensive experiments are conducted on a privately collected gastric endoscopy dataset and five publicly available colonoscopy polyp datasets to assess the effectiveness and generalization of the proposed method. The proposed method achieves mDice scores of 88.3%, 93.6%, and 94.8% on the gastroscopy dataset, Kvasir-SEG, and CVC-ClinicDB, respectively. The source code will be available at https://github.com/0xChunFeng/MSNet.
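The abstract reports results as mDice without defining it. As a rough illustration only, the sketch below computes the Dice coefficient 2|P∩T| / (|P| + |T|) for binary masks and averages it over a set of images, which is the common reading of "mean Dice". The function names `dice` and `mean_dice`, the nested-list mask representation, and the `eps` smoothing term are assumptions made for this example; the paper's actual evaluation protocol (thresholding, smoothing, averaging order) is not stated in the abstract.

```python
from typing import Sequence

# A binary mask as rows of 0/1 values (hypothetical representation for this sketch).
Mask = Sequence[Sequence[int]]


def dice(pred: Mask, target: Mask, eps: float = 1e-8) -> float:
    """Dice coefficient 2|P∩T| / (|P| + |T|) for one pair of binary masks."""
    # Count pixels that are foreground in both masks.
    inter = sum(p * t for p_row, t_row in zip(pred, target)
                for p, t in zip(p_row, t_row))
    # Total foreground pixels in prediction plus ground truth.
    total = sum(map(sum, pred)) + sum(map(sum, target))
    # eps (an assumption) keeps the ratio defined when both masks are empty.
    return (2.0 * inter + eps) / (total + eps)


def mean_dice(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average of per-image Dice scores over a dataset."""
    scores = [dice(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

For example, a prediction with two foreground pixels that overlaps a one-pixel ground truth in a single pixel scores about 2/3, and averaging that with a perfect prediction gives a mean Dice of about 0.83.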
Engineering, Electrical & Electronic; Imaging Science & Photographic Technology