G2LP-Net: Global to Local Progressive Video Inpainting Network

Zhong Ji, Jiacheng Hou, Yimu Su, Yanwei Pang, Xuelong Li
DOI: https://doi.org/10.1109/tcsvt.2022.3209548
IF: 5.859
2022-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Self-attention-based video inpainting methods have achieved promising progress by establishing long-range correlations over the whole video. However, existing methods generally rely on global self-attention that directly searches for missing contents among all reference frames but lacks accurate matching and effective organization of contents, which often blurs the result owing to the loss of local textures. In this paper, we propose a Global-to-Local Progressive Inpainting Network (G2LP-Net) built on the following ideas. First, we present a global-to-local self-attention mechanism that incorporates local self-attention into global self-attention to improve search efficiency and accuracy, where self-attention is applied in multi-scale regions to fully exploit local redundancy for texture recovery. Second, we propose a progressive video inpainting (PVI) method to organize the generated contents, which completes the target video frames from periphery to core so that reliable contents are used first. Last, we develop a window-sliding method for sampling reference frames to obtain rich available information for inpainting. In addition, we release a wire-removal video (WRV) dataset that consists of 150 video clips masked by wires to evaluate video inpainting on irregular, slender regions. Both quantitative and qualitative experiments on the benchmark datasets DAVIS and YouTube-VOS and on our WRV dataset demonstrate the superiority of the proposed G2LP-Net.
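The periphery-to-core ordering mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the ordering is computed, so the approach below is an assumption rather than the authors' method: it peels the hole one ring at a time using 4-neighbor connectivity. The function name `periphery_to_core_order` is hypothetical.

```python
import numpy as np


def periphery_to_core_order(mask: np.ndarray) -> np.ndarray:
    """Assign each missing pixel the pass in which it would be filled.

    mask: 2-D array, nonzero where the frame is missing.
    Returns an int array: 0 for known pixels, k >= 1 for pixels filled in pass k.
    Assumption (not from the paper): a pixel joins the current pass when at
    least one of its 4-neighbors is already known or already filled.
    """
    missing = mask.astype(bool).copy()
    order = np.zeros(mask.shape, dtype=int)
    step = 0
    while missing.any():
        step += 1
        known = ~missing
        # Pad with False so that pixels outside the frame never count as known.
        padded = np.pad(known, 1, constant_values=False)
        has_known_neighbor = (
            padded[:-2, 1:-1]    # neighbor above
            | padded[2:, 1:-1]   # neighbor below
            | padded[1:-1, :-2]  # neighbor to the left
            | padded[1:-1, 2:]   # neighbor to the right
        )
        ring = missing & has_known_neighbor
        if not ring.any():
            # The whole frame is missing, so there is no periphery to start from.
            ring = missing.copy()
        order[ring] = step
        missing &= ~ring
    return order


if __name__ == "__main__":
    hole = np.zeros((5, 5), dtype=int)
    hole[1:4, 1:4] = 1  # 3x3 hole in the middle of a 5x5 frame
    print(periphery_to_core_order(hole))
```

For a 3x3 hole in a 5x5 frame, the eight outer hole pixels are assigned pass 1 and the center pixel pass 2, so the pixels adjacent to known content are handled before the core.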