ASPS: Augmented Segment Anything Model for Polyp Segmentation

Huiqian Li, Dingwen Zhang, Jieru Yao, Longfei Han, Zhongyu Li, Junwei Han
2024-06-30
Abstract: Polyp segmentation plays a pivotal role in colorectal cancer diagnosis. Recently, the emergence of the Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation, leveraging its powerful pre-training capability on large-scale datasets. However, due to the domain gap between natural and endoscopy images, SAM encounters two limitations in achieving effective performance in polyp segmentation. Firstly, its Transformer-based structure prioritizes global and low-frequency information, potentially overlooking local details, and introducing bias into the learned features. Secondly, when applied to endoscopy images, its poor out-of-distribution (OOD) performance results in substandard predictions and biased confidence output. To tackle these challenges, we introduce a novel approach named Augmented SAM for Polyp Segmentation (ASPS), equipped with two modules: Cross-branch Feature Augmentation (CFA) and Uncertainty-guided Prediction Regularization (UPR). CFA integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the integration of domain-specific knowledge while enhancing local features and high-frequency details. Moreover, UPR ingeniously leverages SAM's IoU score to mitigate uncertainty during the training procedure, thereby improving OOD performance and domain generalization. Extensive experimental results demonstrate the effectiveness and utility of the proposed method in improving SAM's performance in polyp segmentation. Our code is available at https://github.com/HuiqianLi/ASPS.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses the limitations of the Segment Anything Model (SAM) in polyp segmentation. Because of the domain gap between natural images and endoscopic images, SAM runs into two major issues on this task:

1. **Loss of Local Details**: SAM's Transformer-based structure prioritizes global and low-frequency information, so it may overlook local details and learn biased features.
2. **Poor Out-of-Distribution (OOD) Performance**: Endoscopic images fall outside SAM's training distribution, so its predictions on them are inaccurate and its confidence outputs are biased.

To address these issues, the authors propose a method called "Augmented SAM for Polyp Segmentation (ASPS)," which consists of two modules:

- **Cross-Branch Feature Augmentation Module (CFA)**: Adds a trainable CNN encoder branch alongside the frozen ViT encoder, injecting domain-specific knowledge and enhancing local features and high-frequency details.
- **Uncertainty-Guided Prediction Regularization Module (UPR)**: Uses SAM's IoU score to reduce uncertainty during training, thereby improving OOD performance and domain generalization.

The method's effectiveness is validated through extensive experiments on five commonly used polyp datasets.
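
To make the two modules more concrete, here is a minimal NumPy sketch of the general ideas. It is not the authors' implementation (see their repository for that). The function names, the concatenate-and-project fusion, and the weighting formula are all assumptions chosen for illustration: `fuse_features` merges frozen-ViT and CNN feature maps with a 1x1 projection, and `uncertainty_weighted_loss` scales a per-pixel cross-entropy by an uncertainty map whose influence grows as the predicted IoU score drops.

```python
import numpy as np


def fuse_features(vit_feat, cnn_feat, proj):
    """Hypothetical CFA-style fusion: concatenate channels, then apply a 1x1 projection.

    vit_feat: (C_vit, H, W) features from the frozen ViT encoder.
    cnn_feat: (C_cnn, H, W) features from the trainable CNN branch.
    proj:     (C_out, C_vit + C_cnn) weights of the 1x1 projection.
    """
    stacked = np.concatenate([vit_feat, cnn_feat], axis=0)
    return np.einsum("oc,chw->ohw", proj, stacked)


def soft_iou(prob, target, eps=1e-6):
    """Differentiable-style IoU between a probability map and a binary mask."""
    inter = (prob * target).sum()
    union = prob.sum() + target.sum() - inter
    return (inter + eps) / (union + eps)


def uncertainty_weighted_loss(prob, target, iou_score, eps=1e-6):
    """Hypothetical UPR-style loss (an assumption, not the paper's formula).

    prob:      (H, W) predicted foreground probabilities.
    target:    (H, W) binary ground-truth mask.
    iou_score: scalar in [0, 1], the model's own predicted IoU.
    """
    p = np.clip(prob, eps, 1.0 - eps)
    # Per-pixel binary cross-entropy.
    bce = -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))
    # Per-pixel uncertainty: 1 at p = 0.5, approaching 0 near p = 0 or 1.
    uncertainty = 1.0 - np.abs(2.0 * p - 1.0)
    # A low predicted IoU puts more weight on uncertain pixels.
    weights = 1.0 + (1.0 - iou_score) * uncertainty
    seg_loss = (weights * bce).mean()
    # Pull the predicted IoU toward the IoU actually achieved.
    iou_loss = (iou_score - soft_iou(prob, target)) ** 2
    return seg_loss + iou_loss
```

In this sketch only `proj` and the CNN branch would receive gradients during training, mirroring the frozen-ViT design described above. A real implementation would use an autodiff framework and operate on batches.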