Hierarchical Vector-Quantized Variational Autoencoder and Vector Credibility Mechanism for High-Quality Image Inpainting

Cheng Li, Dan Xu, Kuai Chen
DOI: https://doi.org/10.3390/electronics13101852
IF: 2.9
2024-05-10
Electronics
Abstract: Image inpainting infers the missing areas of a corrupted image from the information in the undamaged part. With the rapid development of deep learning, many existing image inpainting methods can generate plausible results from damaged images. However, they still suffer from over-smoothed or distorted textures when the image contains complex textural details or large damaged areas. To restore textures at a fine-grained level, we propose an image inpainting method based on a hierarchical VQ-VAE with a vector credibility mechanism. The method first trains the hierarchical VQ-VAE on ground truth images to update two codebooks and to obtain two corresponding vector collections that contain information on the ground truth images. The two vector collections are fed to a decoder to generate the corresponding high-fidelity outputs. An encoder is then trained on the corresponding damaged images. With the help of the prior knowledge provided by the codebooks, it generates vector collections that approximate those of the ground truth. The two vector collections then pass through the decoder of the hierarchical VQ-VAE to produce the inpainted results. In addition, we apply a vector credibility mechanism that encourages the vector collections from damaged images to approximate the vector collections from ground truth images. To further improve the inpainting result, we apply a refinement network, which uses residual blocks with different dilation rates to capture both global information and local textural details. Extensive experiments on several datasets demonstrate that our method outperforms state-of-the-art methods.
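The codebook lookup at the core of any VQ-VAE can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the standard nearest-neighbour quantization step, in which each encoder feature vector is replaced by its closest codebook entry. The function names, the toy codebook, and the toy features are hypothetical, and a hierarchical VQ-VAE would apply the same step once per level with a separate codebook.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[List[float]]]:
    """Map each feature vector to its nearest codebook entry.

    Returns the chosen codebook indices and the quantized vectors.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for feature in features:
        # Pick the codebook index with the smallest distance to this feature.
        best = min(
            range(len(codebook)),
            key=lambda k: squared_distance(feature, codebook[k]),
        )
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Hypothetical toy data: three 2-D codebook entries and three encoder outputs.
toy_codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
toy_features = [[0.1, -0.2], [0.9, 1.2], [0.2, 1.7]]
toy_indices, toy_quantized = quantize(toy_features, toy_codebook)
```

In this toy case each feature snaps to a different codebook entry. In the setting the abstract describes, the codebooks are learned from ground truth images, so an encoder that sees only a damaged image can still select vectors that carry ground truth information.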
Engineering, Electrical & Electronic; Physics, Applied; Computer Science, Information Systems
What problem does this paper attempt to address?
The paper addresses a weakness of existing image inpainting methods: they often produce over-smoothed or distorted textures when an image contains complex textural details or large damaged areas. To restore textures at a fine-grained level, the authors propose an inpainting method based on a hierarchical vector-quantized variational autoencoder (hierarchical VQ-VAE) and a vector credibility mechanism. The method uses prior knowledge from ground truth images to improve the quality of the inpainted images. The main contributions of the paper are:
1. A hierarchical VQ-VAE is trained on ground truth images to update two codebooks and to obtain two vector collections, which the decoder turns into the corresponding high-fidelity outputs. The codebooks contain global and local information from ground truth images and provide the information that a second encoder needs to restore a damaged image.
2. A vector credibility mechanism encourages the two vector collections that the encoder generates from a damaged image to approximate the vector collections from ground truth images. The decoder then uses these vector collections to generate the inpainted image.
3. A refinement network built from residual blocks with different dilation rates further improves the final output by capturing both global information and local textural details.
Together, these components aim to produce more natural, higher-fidelity inpainting results for images with complex textures and large damaged areas.
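The reason residual blocks with different dilation rates can see both local detail and global context can be sketched in one dimension. The example below is an illustration only, not the paper's refinement network: the kernel size, the dilation rates, and all function names are assumptions, and the block omits the nonlinearities and learned weights a real network would have. For stride-1 convolutions with kernel size k and dilation rates d_i, the receptive field of the stack is 1 + sum((k - 1) * d_i).

```python
from typing import List, Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stacked stride-1 convolutions with given dilations."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilated_conv1d(
    signal: Sequence[float], kernel: Sequence[float], dilation: int
) -> List[float]:
    """1-D dilated convolution with zero padding, same output length."""
    center = len(kernel) // 2
    output: List[float] = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            pos = i + (j - center) * dilation
            if 0 <= pos < len(signal):
                acc += weight * signal[pos]
        output.append(acc)
    return output


def residual_dilated_block(
    signal: Sequence[float], kernel: Sequence[float], dilation: int
) -> List[float]:
    """Simplified residual block: input plus its dilated convolution."""
    conv = dilated_conv1d(signal, kernel, dilation)
    return [x + y for x, y in zip(signal, conv)]


# Hypothetical settings: kernel size 3, four stacked layers.
rf_plain = receptive_field(3, [1, 1, 1, 1])    # no dilation
rf_dilated = receptive_field(3, [1, 2, 4, 8])  # growing dilation rates

# An impulse shows how dilation 2 spreads information to positions 1, 3, 5.
impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
block_output = residual_dilated_block(impulse, [1.0, 1.0, 1.0], 2)
```

With these assumed settings, four undilated layers cover 9 samples while the dilated stack covers 31 with the same number of weights. The skip connection keeps the original sample intact, which is how such a block preserves local detail while gathering wider context.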