DANet: Deformable Alignment Network for Video Inpainting

Xutong Lu, Jianfu Zhang
DOI: https://doi.org/10.1007/978-3-030-67832-6_35
2021-01-01
Abstract: The goal of video inpainting is to fill the missing holes in a given video sequence. Because of the additional temporal dimension, generating a plausible result is considerably more challenging for video inpainting than for image inpainting. In this paper, we propose a novel video inpainting network based on deformable alignment, named Deformable Alignment Network (DANet). Given several consecutive frames, DANet aligns the image features from the global level to the pixel level in a coarse-to-fine fashion. After alignment, DANet applies a fusion block to fuse the aligned features with neighboring frames and generate an inpainted frame. The coarse-to-fine alignment architecture enables a better fusion result; combined with the fusion block, it leads to temporal and spatial consistency. Experimental results demonstrate that DANet's outputs are more semantically correct and temporally coherent, and that DANet is comparable with state-of-the-art video inpainting methods.
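The align-then-fuse idea in the abstract can be illustrated with a toy sketch. This is not the authors' network: DANet learns its alignment on feature maps with deformable operations, whereas the sketch below works on raw pixel grids with hand-picked offsets. All function names (`global_shift`, `pixel_warp`, `fuse`) and the test scene are hypothetical, chosen only to show a coarse whole-frame shift followed by a fine per-pixel warp and a mask-based fill.

```python
from typing import List, Tuple

Frame = List[List[float]]


def clamp(v: int, lo: int, hi: int) -> int:
    return max(lo, min(v, hi))


def global_shift(frame: Frame, dy: int, dx: int) -> Frame:
    """Coarse stage (assumed): move the whole frame by an integer offset.

    out[y][x] = frame[y - dy][x - dx], with indices clamped at the border.
    """
    h, w = len(frame), len(frame[0])
    return [[frame[clamp(y - dy, 0, h - 1)][clamp(x - dx, 0, w - 1)]
             for x in range(w)] for y in range(h)]


def bilinear_sample(frame: Frame, y: float, x: float) -> float:
    """Sample a frame at a fractional position with bilinear interpolation."""
    h, w = len(frame), len(frame[0])
    y = max(0.0, min(y, h - 1.0))
    x = max(0.0, min(x, w - 1.0))
    y0, x0 = int(y), int(x)  # floor, since y and x are non-negative here
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = frame[y0][x0] * (1 - wx) + frame[y0][x1] * wx
    bottom = frame[y1][x0] * (1 - wx) + frame[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def pixel_warp(frame: Frame, offsets: List[List[Tuple[float, float]]]) -> Frame:
    """Fine stage (assumed): out[y][x] = frame sampled at (y + oy, x + ox)."""
    h, w = len(frame), len(frame[0])
    return [[bilinear_sample(frame, y + offsets[y][x][0], x + offsets[y][x][1])
             for x in range(w)] for y in range(h)]


def fuse(target: Frame, mask: List[List[int]], aligned: Frame) -> Frame:
    """Fill hole pixels (mask == 1) of the target from the aligned neighbor."""
    return [[aligned[y][x] if mask[y][x] else target[y][x]
             for x in range(len(target[0]))] for y in range(len(target))]


H, W = 3, 5
# Ground-truth scene: a linear ramp, so bilinear interpolation is exact on it.
scene = [[float(10 * y + x) for x in range(W)] for y in range(H)]

# Target frame: the scene with one missing pixel at (1, 1).
target = [row[:] for row in scene]
target[1][1] = 0.0
mask = [[0] * W for _ in range(H)]
mask[1][1] = 1

# Neighbor frame: the same scene displaced 1.5 px to the right.
neighbor = [[10 * y + x - 1.5 for x in range(W)] for y in range(H)]

coarse = global_shift(neighbor, 0, -1)             # removes 1 px of the shift
fine = pixel_warp(coarse, [[(0.0, 0.5)] * W for _ in range(H)])  # residual 0.5 px
fused = fuse(target, mask, fine)

print(fused[1][1])  # 11.0
```

Splitting the displacement into an integer global shift and a small residual warp mirrors the coarse-to-fine motivation: the fine stage only has to correct a sub-pixel error, which is easier than recovering the full motion in one step.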