Integrating instance-level knowledge to see the unseen: A two-stream network for video object segmentation

Hannan Lu, Zhi Tian, Pengxu Wei, Haibing Ren, Wangmeng Zuo
DOI: https://doi.org/10.1016/j.neucom.2024.127878
IF: 6
2024-05-19
Neurocomputing
Abstract: Existing matching-based video object segmentation (VOS) approaches carry inherent limitations in segmenting pixels that have never appeared in the previous frames (i.e., unseen pixels). In this paper, we introduce a Two-Stream Network (TSN), which addresses this issue by softly distinguishing between seen and unseen pixels and processing them with two streams. In particular, a pixel division module is devised to generate a routing map that distinguishes between seen and unseen pixels. Guided by the routing map, TSN explicitly integrates instance-level knowledge from an instance stream and pixel-level information from a pixel stream to generate the final segmentation result. The soft partitioning strategy allows for flexibility and adaptability in the fusion process. Additionally, the compact instance stream encodes and leverages instance-level knowledge, resulting in improved segmentation accuracy on the unseen pixels. Extensive experiments demonstrate the effectiveness of the proposed TSN, and we also report state-of-the-art performance on public VOS benchmarks.
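The routing-map-guided fusion described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the fusion formula, so the convex blend below, the function name `fuse_streams`, and the convention that a routing value near 1 means "seen" are all assumptions made for illustration.

```python
def fuse_streams(pixel_probs, instance_probs, routing_map):
    """Blend two per-pixel foreground probability maps with a soft routing map.

    Assumption (not from the paper): routing_map[y][x] lies in [0, 1] and is
    the confidence that the pixel is "seen". Seen pixels lean on the pixel
    stream, unseen pixels lean on the instance stream.
    """
    fused = []
    for p_row, i_row, r_row in zip(pixel_probs, instance_probs, routing_map):
        fused.append([r * p + (1.0 - r) * i
                      for p, i, r in zip(p_row, i_row, r_row)])
    return fused


# Hypothetical 2x2 maps, chosen only to show the blending behaviour.
pixel_probs = [[0.9, 0.2], [0.5, 0.5]]
instance_probs = [[0.8, 0.9], [0.1, 0.7]]
routing_map = [[1.0, 0.0], [0.5, 0.25]]

fused = fuse_streams(pixel_probs, instance_probs, routing_map)
```

With a routing value of 1.0 the output equals the pixel-stream prediction, with 0.0 it equals the instance-stream prediction, and intermediate values mix the two, which is one simple way a soft partition could avoid a hard seen/unseen decision.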
computer science, artificial intelligence