GRIG: Few-Shot Generative Residual Image Inpainting

Wanglong Lu, Xianta Jiang, Xiaogang Jin, Yong-Liang Yang, Minglun Gong, Tao Wang, Kaijie Shi, Hanli Zhao
2023-04-24
Abstract: Image inpainting is the task of filling in missing or masked regions of an image with semantically meaningful content. Recent methods have shown significant improvement in dealing with large-scale missing regions. However, these methods usually require large training datasets to achieve satisfactory results, and there has been limited research into training these models on a small number of samples. To address this, we present a novel few-shot generative residual image inpainting method that produces high-quality inpainting results. The core idea is an iterative residual reasoning method that incorporates Convolutional Neural Networks (CNNs) for feature extraction and Transformers for global reasoning within generative adversarial networks, along with image-level and patch-level discriminators. We also propose a novel forgery-patch adversarial training strategy to create faithful textures and detailed appearances. Extensive evaluations show that our method outperforms previous methods on the few-shot image inpainting task, both quantitatively and qualitatively.
Computer Vision and Pattern Recognition, Multimedia
What problem does this paper attempt to address?
The paper addresses how to train a high-quality image inpainting model from a limited dataset (i.e., in a few-shot setting). It targets three issues in current image inpainting methods:

1. **Data Efficiency**: Existing methods usually require a large amount of training data to achieve good results, and such data can be difficult or costly to obtain in specific fields (such as medicine or art).
2. **Overfitting Problem**: A model trained on a small dataset is prone to overfitting.
3. **Application of Iterative Inference and Residual Learning**: Current image inpainting methods do not fully reuse previously predicted information during iterative inference and lack designs tailored for few-shot learning.

To address these issues, the authors propose a new framework, **GRIG (Generative Residual Image inpaintinG)**, which combines Convolutional Neural Networks (CNNs) for feature extraction and Transformers for global reasoning within a Generative Adversarial Network (GAN), together with image-level and patch-level discriminators. It improves the model's generalization ability and inpainting quality through iterative residual reasoning and residual learning. GRIG also introduces a forgery-patch adversarial training strategy to improve the realism of textures and the rendering of fine details.

Experiments on 10 few-shot datasets with varied content show that GRIG outperforms existing state-of-the-art methods both on quantitative metrics (such as FID and LPIPS) and in visual quality, which supports its effectiveness for few-shot image inpainting.
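The idea of iterative residual reasoning can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mask convention (1 marks a missing pixel), the number of steps, and the toy residual predictor are all assumptions made for illustration. In GRIG, the residual would come from the trained CNN and Transformer generator rather than from the stand-in used here. The sketch only shows the outer loop, in which each step predicts a correction to the previous estimate and the known pixels are kept from the input.

```python
import numpy as np


def iterative_residual_inpaint(image, mask, predict_residual, num_steps=3):
    """Refine an inpainting estimate by repeatedly adding a predicted residual.

    image: float array of shape (H, W, C) with values in [0, 1].
    mask: array of shape (H, W); 1 marks a missing pixel (assumed convention).
    predict_residual: hypothetical callable (current, mask) -> array shaped like image.
    """
    mask3 = mask[..., None].astype(image.dtype)
    # Start from the masked input: missing pixels are zeroed out.
    current = image * (1.0 - mask3)
    for _ in range(num_steps):
        # Predict a correction to the previous estimate instead of a full image.
        residual = predict_residual(current, mask)
        refined = np.clip(current + residual, 0.0, 1.0)
        # Keep known pixels from the input; update only the missing region.
        current = image * (1.0 - mask3) + refined * mask3
    return current


def toy_residual(current, mask):
    """Stand-in for a trained generator (illustration only).

    Moves every pixel halfway toward the mean color of the known pixels.
    """
    known = mask == 0
    target = current[known].mean(axis=0)  # per-channel mean, shape (C,)
    return 0.5 * (target - current)
```

With the toy predictor, each step halves the remaining gap between the missing region and the mean of the known pixels, so later steps reuse the earlier prediction rather than starting over. A learned predictor would replace `toy_residual` without changing the loop.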