Abstract: Few-shot Semantic Segmentation (FSS) aims to segment a new category given only a few labeled samples, which is a significant challenge. Existing approaches primarily leverage category information from the support set to identify objects of the new category in the query image. However, these models often struggle when the paired support and query images differ substantially. To address the issues stemming from scenario differences and intra-class diversity, this paper proposes an adaptive similarity-guided self-merging network. Firstly, style differences of multi-level features are introduced to reduce the network's sensitivity to scenario variations and to learn an adaptive weight for the K-shot scheme. Secondly, a feature-mask bi-aggregation module is designed to learn an enhanced feature and an initial mask for the query image. Within this module, dynamic correlations cover all spatial locations, providing the global information that is crucial for feature and mask aggregation. Subsequently, a self-merging module is proposed to alleviate prototype bias. It merges a self-prototype derived from the initial mask with an adaptively weighted support prototype obtained from the K support images. Finally, the target object is segmented using the enhanced feature and the merged prototype, and the segmentation results are further refined by predictions of base categories and an adjustment factor derived from the multi-level style differences. The proposed method achieves 69.1% (1-shot) and 72.3% (5-shot) mIoU on the PASCAL-5i dataset, and 47.4% (1-shot) and 52.1% (5-shot) mIoU on the COCO-20i dataset. These results demonstrate state-of-the-art segmentation performance compared to mainstream methods. (c) 2017 Elsevier Inc. All rights reserved.
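
The self-merging step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: prototypes are computed by masked average pooling, the K-shot weights are a softmax over negative style differences (a smaller difference gives a larger weight), and the self-prototype is blended with the support prototype through a fixed factor `alpha`. The function names are hypothetical and are not taken from the paper.

```python
from math import exp


def masked_average_pooling(features, mask):
    """Average the feature vectors at positions selected by a (soft) mask.

    features: list of N feature vectors, each a list of C floats.
    mask:     list of N weights in [0, 1] (1 = foreground).
    """
    channels = len(features[0])
    total = sum(mask)
    if total == 0:
        # Empty mask: return a zero prototype rather than dividing by zero.
        return [0.0] * channels
    return [
        sum(m * f[c] for f, m in zip(features, mask)) / total
        for c in range(channels)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def merge_prototypes(support_protos, style_diffs, self_proto, alpha=0.5):
    """Blend K support prototypes with a query self-prototype.

    Assumption: support shots whose style differs less from the query
    receive larger weights, and alpha is a fixed blending factor.
    """
    weights = softmax([-d for d in style_diffs])
    channels = len(self_proto)
    support = [
        sum(w * p[c] for w, p in zip(weights, support_protos))
        for c in range(channels)
    ]
    return [alpha * s + (1.0 - alpha) * q for s, q in zip(support, self_proto)]
```

In this sketch, the self-prototype would come from applying `masked_average_pooling` to the query features with the initial mask, while each support prototype would come from the support features and their ground-truth masks. In the paper the weights are learned rather than fixed.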

Break the Bias: Delving Semantic Transform Invariance for Few-Shot Segmentation

Reflection Invariance Learning for Few-shot Semantic Segmentation

Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation

CGMGM: A Cross-Gaussian Mixture Generative Model for Few-Shot Semantic Segmentation

Iterative Few-shot Semantic Segmentation from Image Label Text

A Joint Framework Towards Class-aware and Class-agnostic Alignment for Few-shot Segmentation

Adaptive Similarity-Guided Self-Merging Network for Few-Shot Semantic Segmentation

Bidirectional Reciprocative Information Communication for Few-Shot Semantic Segmentation

Boosting Few-Shot Segmentation via Instance-Aware Data Augmentation and Local Consensus Guided Cross Attention

Few-Shot Segmentation via Channel Attention and Supervision Augmentation

MetaMask: Improving Few-Shot Semantic Segmentation Via Multi-Mask Calibration

Few-Shot Segmentation Via Divide-and-Conquer Proxies

Learning What Not to Segment: A New Perspective on Few-Shot Segmentation

Dual Branch Multi-Level Semantic Learning for Few-Shot Segmentation

Self-Support Few-Shot Semantic Segmentation

Bi-aggregation and Self-Merging Network for Few-Shot Image Semantic Segmentation

Anti-aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation

Rethinking and Improving Few-Shot Segmentation from a Contour-Aware Perspective

Few-shot Semantic Segmentation Via Perceptual Attention and Spatial Control

Learning discriminative foreground-and-background features for few-shot segmentation

Blessing Few-Shot Segmentation Via Semi-Supervised Learning with Noisy Support Images