Few-shot video object segmentation with prototype evolution

Binjie Mao, Xiyan Liu, Linsu Shi, Jiazhong Yu, Fei Li, Shiming Xiang
DOI: https://doi.org/10.1007/s00521-023-09325-y
2024-01-03
Neural Computing and Applications
Abstract: Few-shot video object segmentation is a challenging task that aims to segment objects of novel categories in a video given only a few annotated images. Current methods for this task explore only the relationship between the support images and the target query video, ignoring the rich temporal information in the query video itself. To address this problem, we propose a simple yet effective framework named prototype evolution network (PENet) for few-shot video object segmentation. PENet first adopts a prototype-based structure that efficiently constructs and exploits the correlation between the support images and the target query video. A prototype evolution module is then designed to summarize and propagate temporal information through the evolution of the video prototype. The feature representation adopted by the module has a fixed size, so the memory burden does not increase as the video advances. Together with the category prototype extracted from the support set, the global video prototype guides the segmentation of the current frame. Additionally, the use of high-level features is introduced as an optional variant that trades a small amount of speed for higher accuracy. Experimental results on the 2019 and 2021 versions of the YouTube-VIS dataset demonstrate that PENet outperforms previous methods by a sizable margin, validating the superiority of the proposed model.
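The abstract does not specify how the prototypes are computed or updated, so the following is only an illustrative sketch of the general idea, not PENet itself. It assumes masked average pooling to build a prototype, an exponential moving average as a stand-in for the evolution step, and cosine similarity for matching. The function names, the `momentum` parameter, and the `threshold` value are all hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def masked_average_pool(features: Sequence[Sequence[float]],
                        mask: Sequence[float]) -> Vector:
    """Average the feature vectors, weighted by a foreground mask.

    `features` holds one vector per spatial position; `mask` holds one
    weight per position (1.0 for foreground, 0.0 for background).
    """
    dim = len(features[0])
    total = sum(mask)
    if total == 0:
        return [0.0] * dim
    return [sum(f[d] * m for f, m in zip(features, mask)) / total
            for d in range(dim)]


def evolve_prototype(video_proto: Vector, frame_proto: Vector,
                     momentum: float = 0.9) -> Vector:
    """Assumed update rule (exponential moving average).

    The output has the same length as the input, so memory use stays
    constant however many frames have been processed.
    """
    return [momentum * v + (1.0 - momentum) * f
            for v, f in zip(video_proto, frame_proto)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def segment_frame(features: Sequence[Sequence[float]],
                  category_proto: Vector, video_proto: Vector,
                  threshold: float = 0.5) -> List[int]:
    """Mark a position as foreground when its mean similarity to the
    category prototype and the video prototype exceeds `threshold`."""
    return [
        1 if (cosine(f, category_proto) + cosine(f, video_proto)) / 2 > threshold
        else 0
        for f in features
    ]
```

In such a sketch, the category prototype would be pooled once from the support set, while the video prototype would be re-pooled from each predicted mask and folded into the running estimate before the next frame is segmented.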
computer science, artificial intelligence