Contextual Augmentation with Bias Adaptive for Few-Shot Video Object Segmentation.

Shuaiwei Wang, Zhao Liu, Jie Lei, Zunlei Feng, Juan Xu, Xuan Li, Ronghua Liang
DOI: https://doi.org/10.1007/978-3-031-53305-1_27
2024-01-01
Abstract: Few-shot video object segmentation (FSVOS) is a challenging task that aims to segment new object classes across query videos given only a few annotated support images. Typically, a meta learner is the main approach to few-shot tasks. However, current meta learners ignore contextual information and make little use of the temporal information in videos. Moreover, the trained models are biased when segmenting novel classes: they favor the seen classes, which hinders the recognition of novel classes. To address these problems, we propose contextual augmentation with bias adaptive for few-shot video object segmentation, consisting of a context augmented learner (CAL) and a bias adaptive learner (BAL). The context augmented learner processes the contextual information in the video and guides the meta learner to obtain rough prediction results. Afterwards, the bias adaptive learner adapts to the bias against the novel classes. The BAL branch uses a base class learner to identify the base classes and computes the similarity between the query video and the support set, which guides the adaptive integration of the coarse-robust results to generate an accurate segmentation. Experiments conducted on the YouTube-VIS dataset demonstrate that our approach achieves state-of-the-art performance.