Break the Bias: Delving Semantic Transform Invariance for Few-Shot Segmentation

Qinglong Cao, Yuntian Chen, Chao Ma, Xiaokang Yang
DOI: https://doi.org/10.1109/tcsvt.2023.3325629
IF: 5.859
2024-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Few-shot semantic segmentation (FSS) aims to segment objects of unseen classes in query images with only a few annotated support images. Existing FSS algorithms typically focus on mining category representations from the single-view support to match semantic objects of the single-view query. However, with so few annotated samples, single-view matching struggles to perceive the varying characteristics of novel objects, which restricts the learning space for novel categories and in turn induces biased segmentation with degraded parsing performance. To address this challenge, inspired by semantic transform invariance, this paper proposes a new few-shot segmentation framework to break the bias and perform invariant segmentation in a multi-view matching manner. Specifically, original and transformed support features from different perspectives with the same semantics are fused in a learnable manner to obtain a transform invariance prototype with a stronger category representation ability. Simultaneously, to provide better parsing guidance, the Transform Invariance Guidance Mask Generation (TIGM) module is proposed to integrate prior knowledge from different perspectives. Finally, segmentation predictions from varying views are complementarily merged in the Transform Invariance Semantic Prediction (TISP) module to resolve uncertain areas and yield precise segmentation predictions. Extensive experiments on both the PASCAL-5i and COCO-20i datasets demonstrate the effectiveness of our approach and show that our method achieves state-of-the-art performance. Code is available at https://github.com/caoql98/BBD.
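The multi-view idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes a horizontal flip as the semantics-preserving transform, masked average pooling as the prototype extractor, a fixed mixing weight `alpha` in place of the learnable fusion, and plain averaging in place of the TISP merge. All function names here are hypothetical.

```python
from typing import List

Vector = List[float]
FeatureMap = List[List[Vector]]  # H x W grid of C-dimensional feature vectors
Mask = List[List[float]]         # H x W grid, 1.0 = foreground, 0.0 = background


def hflip(grid):
    """Horizontally flip an H x W grid (assumed stand-in for the transform)."""
    return [list(reversed(row)) for row in grid]


def masked_average_pooling(features: FeatureMap, mask: Mask) -> Vector:
    """Average the feature vectors over foreground pixels to get a prototype."""
    channels = len(features[0][0])
    total = [0.0] * channels
    weight = 0.0
    for feat_row, mask_row in zip(features, mask):
        for feat, m in zip(feat_row, mask_row):
            weight += m
            for c in range(channels):
                total[c] += m * feat[c]
    if weight == 0.0:
        raise ValueError("mask has no foreground pixels")
    return [t / weight for t in total]


def fuse_prototypes(original: Vector, transformed: Vector,
                    alpha: float = 0.5) -> Vector:
    """Convex combination of two view prototypes.

    Assumption: alpha is a fixed constant here; the abstract describes the
    fusion as learnable but does not specify its form.
    """
    return [alpha * o + (1.0 - alpha) * t for o, t in zip(original, transformed)]


def merge_predictions(pred_original: Mask, pred_transformed: Mask) -> Mask:
    """Undo the flip on the transformed-view prediction, then average the
    two foreground-probability maps pixel by pixel."""
    aligned = hflip(pred_transformed)
    return [[(a + b) / 2.0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pred_original, aligned)]
```

In this sketch the two support views each yield a prototype through `masked_average_pooling`, `fuse_prototypes` combines them, and `merge_predictions` aligns the query predictions from both views before averaging, so pixels on which the views disagree end up with intermediate probabilities.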