Asymmetric Label Propagation for Video Object Segmentation.
Zhen Chen, Ming Yang, Shiliang Zhang
DOI: https://doi.org/10.1145/3551626.3564943
2022-01-01
Abstract: Semi-supervised video object segmentation aims to segment foreground objects across a video sequence given their masks in the first frame. Motion between adjacent frames tends to be smooth, yet object appearance can change substantially in subsequent frames due to clutter or occlusion. Most existing methods segment a video frame by referring equally to the segmentation masks of its previous frame and the first frame, which makes them prone to unreliable matching and accumulated segmentation errors. To alleviate this issue, this paper proposes to treat the first and previous frames differently so that motion and appearance cues are leveraged reliably, and presents an Asymmetric Label Propagation (ALP) method. ALP consists of a Confidence-guided Local Propagation (CLP) module and a Global Label Matching (GLM) module. CLP propagates labels from the previous frame to the current frame based on local affinity and appearance matching uncertainty. To further recover potentially missing objects and alleviate error accumulation, GLM matches the current frame to both the foreground and the background of the first frame, and adaptively fuses the two matching results. The CLP and GLM outputs are fused to generate object-specific feature maps for multi-object segmentation. Extensive experiments on the DAVIS and YouTube-VOS datasets demonstrate the effectiveness of the proposed method.
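The local propagation idea described for CLP can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how affinity or matching uncertainty is computed, so the dot-product affinity, the softmax with a `temperature`, the square window of size `radius`, and the use of the peak affinity weight as a confidence proxy are all assumptions made here for illustration, and the function name `local_propagate` is hypothetical.

```python
import math


def local_propagate(prev_feats, prev_labels, cur_feats, radius=1, temperature=0.1):
    """Propagate per-pixel label distributions from the previous frame.

    prev_feats / cur_feats: H x W grids of feature vectors (lists of floats).
    prev_labels: H x W grid of per-class probabilities.
    Returns (labels, confidence), both H x W grids.
    """
    height, width = len(cur_feats), len(cur_feats[0])
    num_classes = len(prev_labels[0][0])
    labels = [[None] * width for _ in range(height)]
    confidence = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect dot-product similarities inside the local window.
            neighbours, scores = [], []
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    dot = sum(a * b for a, b in zip(cur_feats[y][x], prev_feats[ny][nx]))
                    neighbours.append((ny, nx))
                    scores.append(dot / temperature)
            # Softmax over the window (shifted by the max for stability).
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            weights = [w / total for w in weights]
            # Affinity-weighted average of the previous-frame labels.
            mixed = [0.0] * num_classes
            for (ny, nx), w in zip(neighbours, weights):
                for c in range(num_classes):
                    mixed[c] += w * prev_labels[ny][nx][c]
            labels[y][x] = mixed
            # Assumed confidence proxy: the peak affinity weight in the window.
            confidence[y][x] = max(weights)
    return labels, confidence


# Toy 1 x 3 frame: the object (feature [1, 0]) moves one pixel to the right.
prev_feats = [[[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]]
cur_feats = [[[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]]
prev_labels = [[[0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]]  # [background, object]
labels, confidence = local_propagate(prev_feats, prev_labels, cur_feats)
```

In this toy case the object label follows the feature to the middle pixel, while the rightmost pixel matches two background neighbours equally well and so receives a lower confidence. A full system in the spirit of the abstract would additionally match against the first frame and fuse the two results, which this sketch does not attempt.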