CFENet: Leveraging CLIP Text Features for Enhanced Few-Shot Semantic Segmentation

Zeyu Zhao, Zhong Chen
DOI: https://doi.org/10.1109/auteee60196.2023.10407616
2023-01-01
Abstract: Few-shot semantic segmentation (FSS) aims to build class-agnostic models that can segment novel classes from only a few annotated examples. We introduce an approach to FSS that integrates textual information through a text feature module built on CLIP’s text features (CFM). Our primary objective is to improve segmentation performance on novel classes, for which only a handful of annotated images are available, by incorporating this textual information. We also propose a multi-scale prior correlation module (MPM) that exploits support-query image features at different scales to supplement and enhance the prototype features. Experimental results on the PASCAL-5i and COCO-20i datasets demonstrate a substantial performance boost on the FSS task. The findings show the effectiveness of both the CFM and the MPM and indicate that the approach is robust and adaptable, making it a promising solution to the challenges posed by FSS.