GAN Inversion of High-Resolution Images

Tanmay Deshmukh, Mohit Bhat
DOI: https://doi.org/10.36548/jiip.2022.2.005
2022-07-18
Journal of Innovative Image Processing
Abstract: Image generation is the task of automatically generating an image from an input vector z. In recent years, understanding and manipulating this input vector has attracted growing attention because of its potential applications. Previous works have shown promising results in interpreting the latent space of a pre-trained generator G to generate images up to 256 x 256 using supervised and unsupervised techniques. This paper addresses the challenge of interpreting the latent space of a pre-trained generator G to generate high-resolution images, i.e., images with resolution up to 1024 x 1024. The problem is tackled by proposing a new framework that builds upon the Cyclic Reverse Generator (CRG) by upgrading the encoder E in CRG to handle high-resolution images. The model can successfully interpret the latent space of the generator in complex generative models such as the Progressive Growing Generative Adversarial Network (PGGAN) and StyleGAN. The framework then maps the input vector zf to the image attributes defined in the dataset, which gives precise control over the output of generator models. This control is helpful in computer vision applications such as photo editing and face manipulation. One downside of the framework is its reliance on a comprehensive dataset, which limits its applicability.