Bi-aggregation and Self-Merging Network for Few-Shot Image Semantic Segmentation

Yu Liu, Ming Yu, Ye Zhu
DOI: https://doi.org/10.37188/cjlcd.2024-0074
2024-01-01
Chinese Journal of Liquid Crystals and Displays
Abstract: Few-shot image semantic segmentation is a challenging task that aims to segment objects of new classes using only a few labeled samples. Mainstream methods often suffer from weakly discriminative features and prototype bias. To alleviate these problems, a new few-shot image semantic segmentation method based on a bi-aggregation and self-merging network is proposed, which fully mines feature similarity and reduces prototype bias. First, we propose a feature-mask bi-aggregation module that provides global semantic information for feature aggregation and mask aggregation by constructing a dense similarity relation between the support features and the query features over all spatial locations. Specifically, performing feature and mask bi-aggregation on the similarity matrices yields an enhanced feature and an initial mask with guiding information for the query image. Then, a self-merging decoder is proposed, which reduces the prototype bias by adding the initial-mask-based self-prototype to the known support prototypes, and conveys rich category semantic information to the decoder by fusing the merged prototype with the enhanced feature. Finally, the decoder's predictions are further refined using the predictions for the base classes. The mIoU of our method on the PASCAL-5i dataset reaches 68.33% and 71.5% in the 1-shot and 5-shot settings, respectively, and on the COCO-20i dataset reaches 46.5% and 51.4% in the 1-shot and 5-shot settings, respectively. These results are superior to those of mainstream methods, and the method segments the target region of new classes more accurately.
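The bi-aggregation and prototype-merging steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the cosine similarity, the softmax temperature, and the residual addition that forms the enhanced feature are all assumptions made for illustration. Features are plain Python lists so the example is self-contained, whereas a practical version would operate on tensors.

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)

def softmax(scores, temperature=0.1):
    # Temperature is an assumed hyperparameter, not taken from the paper.
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def bi_aggregate(query_feats, support_feats, support_mask):
    """Aggregate support features and support mask values for each query location.

    query_feats / support_feats: lists of feature vectors, one per spatial location.
    support_mask: list of foreground values in [0, 1], one per support location.
    """
    enhanced, init_mask = [], []
    for q in query_feats:
        # Dense similarity of this query location to every support location.
        weights = softmax([cosine(q, s) for s in support_feats])
        # Feature aggregation: similarity-weighted sum of support features.
        agg = [sum(w * s[d] for w, s in zip(weights, support_feats))
               for d in range(len(q))]
        # Residual addition is an assumption about how the feature is enhanced.
        enhanced.append([a + b for a, b in zip(q, agg)])
        # Mask aggregation: similarity-weighted sum of support mask values.
        init_mask.append(sum(w * m for w, m in zip(weights, support_mask)))
    return enhanced, init_mask

def masked_average(feats, mask):
    # Masked average pooling: mask-weighted mean of the feature vectors.
    total = sum(mask) + 1e-8
    return [sum(m * f[d] for f, m in zip(feats, mask)) / total
            for d in range(len(feats[0]))]

def merge_prototypes(query_feats, init_mask, support_feats, support_mask):
    # Add the query self-prototype (from the initial mask) to the support prototype.
    self_proto = masked_average(query_feats, init_mask)
    support_proto = masked_average(support_feats, support_mask)
    return [a + b for a, b in zip(self_proto, support_proto)]

# Toy example: two support locations, the first one is foreground.
support_feats = [[1.0, 0.0], [0.0, 1.0]]
support_mask = [1.0, 0.0]
query_feats = [[1.0, 0.1], [0.1, 1.0]]

enhanced, init_mask = bi_aggregate(query_feats, support_feats, support_mask)
merged = merge_prototypes(query_feats, init_mask, support_feats, support_mask)
```

In this toy case the first query location resembles the foreground support location, so its initial mask value is close to 1 while the second is close to 0. The merged prototype is therefore dominated by the foreground direction.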