Lightweight Video Object Segmentation Based on ConvGRU.

Rui Yao, Yikun Zhang, Cunyuan Gao, Yong Zhou, Jiaqi Zhao, Lina Liang
DOI: https://doi.org/10.1007/978-3-030-31723-2_37
2019-01-01
Abstract: Video object segmentation is one of the key tasks of video processing and a foundation for high-level computer vision applications. The spatio-temporal context information in a video is of great significance for video object segmentation. Existing algorithms usually introduce spatio-temporal context through separately pre-trained models, such as optical flow estimation, which leads to sub-optimal solutions and heavy consumption of computational resources. To address these problems, this paper proposes an end-to-end lightweight video object segmentation model based on ConvGRU. A convolutional neural network extracts the visual features of each frame, and a recurrent neural network extracts the spatio-temporal context information of the whole video. The ConvGRU achieves a deep fusion of the visual features and the spatio-temporal context information. Because it is built on MobileNet, the lightweight model can meet the demands of practical applications and addresses the problem of high computational resource consumption. Experiments on the DAVIS 2016 dataset show that our method is competitive with similar state-of-the-art methods.
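For orientation, the fusion step can be illustrated with the commonly used ConvGRU update. This is a sketch under stated assumptions: the abstract gives no equations, so the symbols below (x_t for the per-frame CNN feature map, h_t for the hidden state carrying spatio-temporal context, W and U for convolution kernels, b for biases) are illustrative notation and not the authors' own. Here * denotes convolution, \odot element-wise multiplication, and \sigma the logistic sigmoid.

```latex
\begin{align}
  z_t &= \sigma\left(W_z * x_t + U_z * h_{t-1} + b_z\right) \\
  r_t &= \sigma\left(W_r * x_t + U_r * h_{t-1} + b_r\right) \\
  \tilde{h}_t &= \tanh\left(W_h * x_t + U_h * (r_t \odot h_{t-1}) + b_h\right) \\
  h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{align}
```

Because the gates are computed by convolutions instead of fully connected layers, the hidden state h_t keeps the spatial layout of the feature map, which is what allows it to be decoded into a per-pixel segmentation mask. Some formulations swap the roles of z_t and 1 - z_t in the last line; the paper may use either convention.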