Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration

Kechun Liu, Yitong Jiang, Inchang Choi, Jinwei Gu
DOI: https://doi.org/10.48550/arXiv.2306.06513
2023-06-10
Computer Vision and Pattern Recognition
Abstract: Recent work on discrete generative priors, in the form of codebooks, has shown exciting performance for image reconstruction and restoration, as the discrete prior space spanned by the codebooks increases the robustness against diverse image degradations. Nevertheless, these methods require separate training of codebooks for different image categories, which limits their use to specific image categories only (e.g. face, architecture, etc.), and fail to handle arbitrary natural images. In this paper, we propose AdaCode for learning image-adaptive codebooks for class-agnostic image restoration. Instead of learning a single codebook for each image category, we learn a set of basis codebooks. For a given input image, AdaCode learns a weight map with which we compute a weighted combination of these basis codebooks for adaptive image restoration. Intuitively, AdaCode is a more flexible and expressive discrete generative prior than previous work. Experimental results demonstrate that AdaCode achieves state-of-the-art performance on image reconstruction and restoration tasks, including image super-resolution and inpainting.
What problem does this paper attempt to address?
This paper addresses a limitation of existing discrete generative priors in the form of codebooks. These priors perform well on image reconstruction and restoration, but they require a separate codebook to be trained for each image category, which prevents them from handling arbitrary natural images. Specifically, a single codebook cannot fully capture complex visual patterns, so it produces visible artifacts on images that contain content from multiple categories.

To solve this problem, the authors propose AdaCode, which learns image-adaptive codebooks for class-agnostic image restoration. AdaCode learns a set of basis codebooks and, for each input image, predicts a weight map that is used to combine them, yielding a more flexible and expressive discrete generative prior. In this way, AdaCode can better handle natural images containing multiple kinds of semantic content without training a separate codebook for each category.

### Main Contributions

1. **AdaCode**: An adaptive codebook learning method for class-agnostic image restoration.
2. **Flexibility and Expressiveness**: By combining multiple basis codebooks, AdaCode can represent complex natural images more flexibly than a single codebook.
3. **Wide Applicability**: AdaCode achieves state-of-the-art performance on tasks such as image super-resolution and inpainting, supporting its effectiveness and generality.

### Method Overview

AdaCode is trained in three stages:

1. **First Stage (Class-specific Codebook Pretraining)**: Divide the high-quality image dataset into multiple semantic subsets and train a class-specific VQGAN on each subset to obtain the basis codebooks.
2. **Second Stage (AdaCode Representation Learning)**: Starting from the pretrained basis codebooks, train AdaCode on a self-reconstruction task so that it learns the weight maps used to combine them.
3. **Third Stage (Restoration via AdaCode)**: Keep the basis codebooks and decoder fixed and fine-tune for downstream restoration tasks.

In this way, AdaCode can significantly improve the quality and robustness of image restoration without a large increase in parameter count.
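
The weighted combination of basis codebooks can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it assumes that each feature vector is quantized to its nearest entry in every basis codebook and that the results are blended with per-location weights that are non-negative and sum to 1. The names `nearest_code`, `adaptive_quantize`, and `quantize_feature_map` are hypothetical, and the weight map is passed in as given rather than predicted by a network.

```python
from typing import List, Sequence

Vector = List[float]
Codebook = List[Vector]  # one basis codebook: a list of code vectors


def nearest_code(z: Sequence[float], codebook: Codebook) -> Vector:
    """Return the codebook entry closest to z in squared Euclidean distance."""
    return min(codebook, key=lambda c: sum((a - b) ** 2 for a, b in zip(z, c)))


def adaptive_quantize(
    z: Sequence[float], codebooks: List[Codebook], weights: Sequence[float]
) -> Vector:
    """Quantize z against every basis codebook, then blend the results.

    Assumption (not from the paper text): weights holds one non-negative
    value per basis codebook and the values sum to 1.
    """
    if len(codebooks) != len(weights):
        raise ValueError("need exactly one weight per basis codebook")
    blended = [0.0] * len(z)
    for codebook, w in zip(codebooks, weights):
        q = nearest_code(z, codebook)
        for i, value in enumerate(q):
            blended[i] += w * value
    return blended


def quantize_feature_map(
    features: List[Vector],
    codebooks: List[Codebook],
    weight_map: List[Sequence[float]],
) -> List[Vector]:
    """Apply adaptive_quantize at every location of a flattened feature map."""
    return [adaptive_quantize(z, codebooks, w) for z, w in zip(features, weight_map)]


# Two toy basis codebooks with 2-D codes (illustrative values only).
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
]
toy_features = [[0.9, 0.8], [0.1, 1.9]]
toy_weight_map = [[0.75, 0.25], [0.0, 1.0]]
toy_quantized = quantize_feature_map(toy_features, toy_codebooks, toy_weight_map)
```

Because the weights vary per location, a region that resembles one semantic subset can lean on that subset's codebook while a neighboring region draws mostly on another, which is the flexibility the summary attributes to AdaCode.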