Part-aware attention correctness for video salient object detection
Ze-yu Liu, Jian-wei Liu
DOI: https://doi.org/10.1016/j.engappai.2022.105733
IF: 8
2022-12-25
Engineering Applications of Artificial Intelligence
Abstract: Video salient object detection (VSOD), which aims to detect the most conspicuous objects or regions in a video, has become an important research topic over the past few years. Previous studies mainly focus on spatial-temporal architectures that rely heavily on implicit attention models to aggregate complementary information from adjacent video frames. Despite remarkable improvements, existing approaches pay little attention to cross-video affinities, which are important for building an explicit attention scheme for VSOD. To this end, we propose a novel attention correctness strategy to supervise the aggregation process. Specifically, unlike previous works, we employ a pairwise training scheme that leverages both positive and negative aggregation supervision to explore inter-video affinities for VSOD. The proposed mechanism successfully suppresses negative correspondences between video frames and reinforces discriminative feature mining for conspicuous objects. To enhance intra-video correspondence, we propose a part-aware similarity aggregation module that uses intra-video affinities to segment the salient objects with video-level context. Extensive experiments are conducted on six popular benchmarks: FBMS, DAVIS, DAVSOD, SegTrack-V2, VOS and ViS. Experimental results on challenging scenes (e.g., on DAVSOD-T, we achieve improvements of 0.4% in MAE, 1.1% in maximum F-measure and 0.5% in S-measure over other competitive models) demonstrate the effectiveness of the proposed method.
automation & control systems; computer science, artificial intelligence; engineering, electrical & electronic, multidisciplinary
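The abstract reports gains in MAE and maximum F-measure. As a rough illustration of how these two scores are commonly computed for a single saliency map, here is a minimal sketch. It is not taken from the paper: the function names `mae` and `max_f_measure`, the 256-step threshold sweep, and the weight `beta_sq = 0.3` are assumptions based on common practice in salient object detection, and the S-measure is not covered.

```python
import numpy as np


def mae(pred, gt):
    """Mean absolute error between a saliency map in [0, 1] and a binary mask."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.mean(np.abs(pred - gt)))


def max_f_measure(pred, gt, beta_sq=0.3, num_thresholds=256):
    """Largest F-beta score over a sweep of binarization thresholds.

    beta_sq = 0.3 is an assumed convention, not a value stated in the abstract.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt) > 0.5  # treat the ground truth as a boolean mask
    best = 0.0
    for t in np.linspace(0.0, 1.0, num_thresholds):
        binary = pred >= t
        tp = np.logical_and(binary, gt).sum()
        # Guard against empty predictions or empty ground truth.
        precision = tp / max(binary.sum(), 1)
        recall = tp / max(gt.sum(), 1)
        denom = beta_sq * precision + recall
        if denom > 0:
            best = max(best, (1 + beta_sq) * precision * recall / denom)
    return float(best)
```

Benchmark scores are normally averaged over all frames of a dataset; the functions above score one frame at a time, so a caller would loop over frames and average the results.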