Memory Aggregated CFBI+ for Interactive Video Object Segmentation

Chen Liang, Zongxin Yang, Jiaxu Miao, Yunchao Wei, Yi Yang
2020-01-01
Abstract: In this paper, we propose a novel framework for interactive video object segmentation. To deal with the gradually increasing scribble information, our framework uses two independent networks, one for user interaction and one for temporal propagation. For the interaction part, we adopt an inside-outside, single-object, coarse-to-fine structure augmented with a pyramid scene parsing module for aggregating global contextual information (IOI-Net). For the temporal propagation part, to record informative knowledge from previous interaction rounds, the proposed model (MCFBI-Net) adopts a simple yet effective memory aggregation mechanism based on the Collaborative video object segmentation by Multi-scale Foreground-Background Integration (CFBI+) method, which fully utilizes the rich information from both foreground and background pixels. In addition, we introduce the High Confidence Filter and the Background Random Drop Mechanism to improve robustness in discovering challenging objects. Our approach took 2nd place according to J&F@60s and 3rd place by AUC score on the interactive track of the DAVIS Challenge on Video Object Segmentation 2020.
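The round-based control flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: `interaction_net`, `propagation_net`, and `run_session` are hypothetical names, both networks are replaced by trivial placeholders, and the memory is assumed to be a plain dictionary of per-round masks, since the abstract does not specify how MCFBI-Net aggregates memory.

```python
import numpy as np

def interaction_net(frame, scribble):
    """Hypothetical stand-in for IOI-Net: scribble on one frame -> mask."""
    # Placeholder only: a real network would refine the scribble into a full mask.
    return scribble > 0

def propagation_net(frames, memory):
    """Hypothetical stand-in for MCFBI-Net: propagate using all remembered rounds."""
    remembered = sorted(memory)
    masks = []
    for t in range(len(frames)):
        # Placeholder only: copy the mask of the nearest remembered frame.
        nearest = min(remembered, key=lambda i: abs(i - t))
        masks.append(memory[nearest].copy())
    return masks

def run_session(frames, rounds):
    """rounds: sequence of (frame_index, scribble) pairs, one per interaction round."""
    memory = {}   # assumed memory layout: frame index -> mask from that round
    masks = None
    for frame_idx, scribble in rounds:
        # Interaction step: the user's scribble updates one frame.
        memory[frame_idx] = interaction_net(frames[frame_idx], scribble)
        # Propagation step: every round so far informs the whole video.
        masks = propagation_net(frames, memory)
    return masks
```

The point of the sketch is the separation of concerns: the interaction step only ever touches the annotated frame, while the propagation step sees the accumulated memory from all earlier rounds rather than just the latest one.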