Deep Learning-based Image and Video Inpainting: A Survey

Weize Quan, Jiaxi Chen, Yanli Liu, Dong-Ming Yan, Peter Wonka
2024-01-07
Abstract: Image and video inpainting is a classic problem in computer vision and computer graphics, aiming to fill missing areas of images and videos with plausible and realistic content. With the advance of deep learning, this problem has seen significant progress recently. The goal of this paper is to comprehensively review deep learning-based methods for image and video inpainting. Specifically, we sort existing methods into categories from the perspective of their high-level inpainting pipeline, present different deep learning architectures, including CNN, VAE, GAN, and diffusion models, and summarize techniques for module design. We review the training objectives and the common benchmark datasets. We present evaluation metrics for low-level pixel and high-level perceptual similarity, conduct a performance evaluation, and discuss the strengths and weaknesses of representative inpainting methods. We also discuss related real-world applications. Finally, we discuss open challenges and suggest potential future research directions.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper surveys deep learning-based image and video inpainting techniques. Image and video inpainting aims to fill missing or occluded regions of digital images or videos with plausible and realistic content. Deep learning has driven significant progress in this field. The paper comprehensively reviews these deep learning-based methods, classifies them according to their high-level inpainting pipeline, and introduces the deep learning architectures they use, such as Convolutional Neural Networks (CNN), Variational Autoencoders (VAE), Generative Adversarial Networks (GAN), and diffusion models. It also summarizes module design techniques, training objectives, commonly used benchmark datasets, and evaluation metrics for low-level pixel similarity and high-level perceptual similarity. The authors evaluate the performance of representative inpainting methods, discuss their strengths and weaknesses, and explore related real-world applications. Finally, the paper discusses open challenges and suggests potential future research directions.