LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis

Jiapeng Zhu, Ceyuan Yang, Yujun Shen, Zifan Shi, Bo Dai, Deli Zhao, Qifeng Chen
2023-09-25
Abstract: This work presents an easy-to-use regularizer for GAN training, which helps explicitly link some axes of the latent space to a set of pixels in the synthesized image. Establishing such a connection facilitates a more convenient local control of GAN generation, where users can alter the image content only within a spatial area simply by partially resampling the latent code. Experimental results confirm four appealing properties of our regularizer, which we call LinkGAN. (1) The latent-pixel linkage is applicable to either a fixed region (i.e., same for all instances) or a particular semantic category (i.e., varying across instances), like the sky. (2) Two or multiple regions can be independently linked to different latent axes, which further supports joint control. (3) Our regularizer can improve the spatial controllability of both 2D and 3D-aware GAN models, barely sacrificing the synthesis performance. (4) The models trained with our regularizer are compatible with GAN inversion techniques and maintain editability on real images.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper aims to address the challenge of precisely controlling local regions in image synthesis with Generative Adversarial Networks (GANs). Specifically, it attempts to solve the following problems:

1. **Instability and inaccuracy**: Existing methods usually rely on post-hoc analysis when discovering the relationship between the latent space and the image, which leads to instability and inaccuracy of the results. For example, it is difficult to find semantically meaningful sub-spaces in high-dimensional latent spaces.
2. **Lack of flexibility**: Most of the existing image editing methods are linear and based on vector arithmetic, which limits the diversity of editing and makes it difficult to achieve fine-grained control of specific regions.
3. **Lack of explicit connection**: Although some works have shown that certain sub-spaces of the latent space can control the local semantics of the output image, these methods lack an explicit connection between the latent axes and the image pixels, making local control less precise.

To solve these problems, the paper proposes a new regularizer called LinkGAN. By introducing this regularizer during the training process, some axes of the latent space can be explicitly linked to a specific set of pixels in the image, thereby achieving more accurate and convenient local control.

### Main contributions of LinkGAN

- **Explicit linking**: LinkGAN can explicitly link certain axes of the latent space to a specific set of pixels of the image, allowing users to change only a local region in the image by partially resampling the latent code.
- **Independent control of multiple regions**: LinkGAN supports independent linking of multiple regions to different latent axes, allowing for joint manipulation of these regions.
- **Applicable to 2D and 3D GAN models**: LinkGAN is applicable not only to 2D image synthesis models but also to 3D-aware image synthesis models, improving spatial controllability without significantly sacrificing synthesis performance.
- **Compatible with GAN inversion techniques**: Models trained with LinkGAN are compatible with GAN inversion techniques and can maintain editability on real images.

In conclusion, LinkGAN provides a new perspective for learning controllable image synthesis methods, enabling generative models to control image content more flexibly and precisely.
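As a rough illustration of what "partially resampling the latent code" could look like in practice, here is a minimal sketch in plain Python. It is not the authors' implementation: the latent dimensionality, the Gaussian prior, the axis ranges, and the names `partially_resample`, `region_a_axes`, and `region_b_axes` are all assumptions made for the example, and the generator itself is left out. The sketch only shows the bookkeeping: the axes linked to a region are redrawn, and every other axis is left untouched.

```python
import random


def partially_resample(z, linked_axes, rng):
    """Return a copy of z in which only the linked axes are redrawn.

    Assumption: latent entries follow a standard normal prior. The choice of
    latent space and prior in an actual LinkGAN model may differ.
    """
    linked = set(linked_axes)
    return [rng.gauss(0.0, 1.0) if i in linked else v for i, v in enumerate(z)]


rng = random.Random(0)

LATENT_DIM = 512               # assumed size, not taken from the paper
region_a_axes = range(0, 64)   # hypothetical axes linked to one image region
region_b_axes = range(64, 128)  # hypothetical axes linked to a second region

# Original latent code.
z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]

# Edit region A only: axes 0..63 change, all other axes stay fixed.
z_a = partially_resample(z, region_a_axes, rng)

# Joint control: additionally edit region B, keeping the region A edit.
z_ab = partially_resample(z_a, region_b_axes, rng)
```

With a model trained under such a regularizer, feeding `z` and `z_a` to the generator would be expected to yield images that differ mainly inside the region linked to `region_a_axes`. Because each region has its own disjoint set of axes, the two edits can be applied one after the other without interfering.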