Using Guided Self-Attention with Local Information for Polyp Segmentation

Linghan Cai, Meijing Wu, Lijiang Chen, Wenpei Bai, Min Yang, Shuchang Lyu, Qi Zhao
DOI: https://doi.org/10.1007/978-3-031-16440-8_60
2022-01-01
Abstract: Automatic and precise polyp segmentation is crucial for the early diagnosis of colorectal cancer. Existing polyp segmentation methods are mostly based on convolutional neural networks (CNNs), which usually use global features to enhance local features through well-designed modules, thereby handling the diversity of polyps. Although CNN-based methods achieve impressive results, they cannot explicitly model long-range relations, which limits their performance. Unlike CNNs, the Transformer has a strong capability for modeling long-range relations owing to self-attention. However, self-attention tends to spread attention to unexpected regions, and the Transformer's local feature extraction is insufficient, resulting in inaccurate localization and fuzzy boundaries. To address these issues, we propose PPFormer for accurate polyp segmentation. Specifically, we first adopt a shallow CNN encoder and a deep Transformer encoder to extract rich features. In the decoder, we present PP-guided self-attention, which uses prediction maps to guide self-attention toward hard regions so as to enhance the model's perception of polyp boundaries. Meanwhile, the Local-to-Global mechanism is designed to encourage the Transformer to capture more information within the local window for better polyp localization. Extensive experiments on five challenging datasets show that PPFormer outperforms other advanced methods and achieves state-of-the-art results on six metrics, including mean Dice and mean IoU.
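The abstract reports results in terms of mean Dice and mean IoU. As a rough illustration only, the sketch below shows how these two overlap scores are commonly computed for binary segmentation masks and then averaged over a dataset. The function names, the nested-list mask format, and the `eps` smoothing term are assumptions made for this example; this is not the paper's evaluation code.

```python
def dice_and_iou(pred, target, eps=1e-8):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; eps avoids division by zero
    when both masks are empty (an assumption, not taken from the paper).
    """
    flat_pred = [int(v) for row in pred for v in row]
    flat_target = [int(v) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked as polyp in both the prediction and the ground truth.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    pred_area = sum(flat_pred)
    target_area = sum(flat_target)
    union = pred_area + target_area - intersection

    dice = (2 * intersection + eps) / (pred_area + target_area + eps)
    iou = (intersection + eps) / (union + eps)
    return dice, iou


def mean_dice_and_iou(pairs):
    """Average Dice and IoU over an iterable of (pred, target) mask pairs."""
    scores = [dice_and_iou(p, t) for p, t in pairs]
    n = len(scores)
    mean_dice = sum(d for d, _ in scores) / n
    mean_iou = sum(i for _, i in scores) / n
    return mean_dice, mean_iou
```

For a single image, Dice and IoU are related by Dice = 2·IoU / (1 + IoU), so they rank individual predictions identically; their dataset-level means can still differ in how they weight poor predictions.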