Joint Attention Mechanism for Unsupervised Video Object Segmentation

Rui Yao, Xin Xu, Yong Zhou, Jiaqi Zhao, Liang Fang
DOI: https://doi.org/10.1007/978-3-030-88004-0_13
2021-01-01
Abstract: In this work, we propose an unsupervised video object segmentation framework based on a joint attention mechanism. After extracting features from the video frames, the method uses a joint attention module to mine the correlation between different frames of the same video, and uses the video's global consistency information to guide segmentation. The joint attention module comprises a soft attention unit and a co-attention unit. The former emphasizes important information in the feature embedding of a frame, and the latter enhances the features of the current frame by computing the correlation between features from different frames. Furthermore, stacking several joint attention modules lets different frames exchange information more comprehensively and deeply, which yields better performance. We conducted experiments on several benchmark datasets to verify the effectiveness of our algorithm. The experimental results show that the joint attention module can effectively capture global consistency information and improve segmentation accuracy.
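The two units described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the spatial softmax weighting in `soft_attention`, the bilinear affinity in `co_attention`, and the parameters `w`, `W`, and `P` are all assumptions chosen for illustration. Features are assumed to be flattened to shape (channels, positions).

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_attention(feat, w):
    # feat: (C, N) frame features; w: (C,) hypothetical scoring vector.
    # Assumption: score each position, normalize over positions, and
    # reweight the features with a residual connection.
    weights = softmax(w @ feat, axis=0)        # (N,)
    return feat * (1.0 + weights[None, :])     # (C, N)

def co_attention(feat_a, feat_b, W):
    # feat_a: (C, Na) current frame; feat_b: (C, Nb) reference frame.
    # Assumption: bilinear affinity between every pair of positions.
    affinity = feat_a.T @ W @ feat_b           # (Na, Nb)
    attn = softmax(affinity, axis=1)           # rows sum to 1 over frame b
    attended = feat_b @ attn.T                 # (C, Na)
    # Enhance the current frame with the information gathered from frame b.
    return np.concatenate([feat_a, attended], axis=0)  # (2C, Na)

def joint_attention(feat_a, feat_b, w, W, P):
    # P: (C, 2C) hypothetical projection back to C channels, so the
    # module's output can be fed into another joint attention module.
    a = soft_attention(feat_a, w)
    b = soft_attention(feat_b, w)
    return P @ co_attention(a, b, W)           # (C, Na)
```

Because the output has the same shape as the current-frame input, the module can be applied repeatedly, e.g. `joint_attention(joint_attention(fa, fb, w, W, P), fb, w, W, P)`, which mirrors the stacking mentioned in the abstract under the assumptions above.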