Abstract: Conditional GANs are frequently used for manipulating the attributes of face images, such as expression, hairstyle, pose, or age. Even though the state-of-the-art models successfully modify the requested attributes, they simultaneously modify other important characteristics of the image, such as a person's identity. In this paper, we focus on solving this problem by introducing PluGeN4Faces, a plugin to StyleGAN, which explicitly disentangles face attributes from a person's identity. Our key idea is to perform training on images retrieved from movie frames, where a given person appears in various poses and with different attributes. By applying a type of contrastive loss, we encourage the model to group images of the same person in similar regions of latent space. Our experiments demonstrate that the modifications of face attributes performed by PluGeN4Faces are significantly less invasive on the remaining characteristics of the image than in the existing state-of-the-art models.

What problem does this paper attempt to address?

This paper addresses a side effect of attribute editing with conditional generative adversarial networks (GANs) built on models such as StyleGAN: changing a requested facial attribute often also changes the person's identity. The authors propose PluGeN4Faces, a plugin model that disentangles the latent space of StyleGAN so that modifying facial attributes does not significantly affect the person's identity or other facial features.

The key innovation is training on images taken from movie frames, which show the same person in different poses and with different attributes. A contrastive loss encourages the model to cluster images of the same person in similar regions of the latent space.

Specifically, PluGeN4Faces is a conditional invertible normalizing flow (cINF) module attached to the style space of StyleGAN. It transforms the style codes produced by the StyleGAN encoder into a disentangled space in which each labeled attribute is modeled by separate latent dimensions and images of the same person are clustered in similar regions. The flow is conditioned on the layer index of the style code.

Experiments show that, compared to existing models, PluGeN4Faces significantly reduces the impact of attribute editing on other image features, including person identity. The paper supports this claim with a quantitative comparison against related models.

Overall, the contributions of this paper include:

1. Proposing a StyleGAN plugin model for editing attributes of real images, trained on real images and using an encoder network to encode images into the style space of StyleGAN.
2. Improving the representation disentanglement of conditional generative models and explicitly encoding person identity through the application of contrastive loss, reducing the intrusiveness of requested attribute modifications on other image characteristics (including identity).
3. Rigorously evaluating the proposed solution and conducting fair comparisons with related models in terms of performance.
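
The summary above says only that "a type of contrastive loss" groups images of the same person in the latent space; it does not give the exact formula. As a rough illustration of the idea, the sketch below implements the classic margin-based pairwise contrastive loss in NumPy. This is an assumed stand-in, not necessarily the loss used in PluGeN4Faces, and the names `identity_contrastive_loss`, `z`, `person_ids`, and `margin` are hypothetical.

```python
import numpy as np


def identity_contrastive_loss(z, person_ids, margin=1.0):
    """Margin-based pairwise contrastive loss (illustrative assumption).

    z          : (N, D) array of latent codes, one row per image.
    person_ids : (N,) array of identity labels, e.g. one label per actor.
    margin     : distance beyond which different-person pairs are not penalized.
    """
    z = np.asarray(z, dtype=float)
    person_ids = np.asarray(person_ids)

    # Pairwise Euclidean distances between all latent codes.
    diff = z[:, None, :] - z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # True where two images show the same person.
    same = person_ids[:, None] == person_ids[None, :]

    # Count each unordered pair once (upper triangle, diagonal excluded).
    iu = np.triu_indices(len(z), k=1)
    dist, same = dist[iu], same[iu]

    pull = dist ** 2                           # same person: penalize distance
    push = np.maximum(0.0, margin - dist) ** 2  # different people: penalize closeness
    return float(np.where(same, pull, push).mean())


# Two frames of person 0 that coincide, and one frame of person 1 far away:
# no pair is penalized, so the loss is zero.
codes = [[0.0, 0.0], [0.0, 0.0], [5.0, 5.0]]
print(identity_contrastive_loss(codes, [0, 0, 1]))
```

In a setup like the one described, such a term would be computed on batches of movie frames, where several frames share an identity label, and added to the flow's training objective. How the terms are weighted is not stated in this summary.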

Face Identity-Aware Disentanglement in StyleGAN

StyleFace: Towards Identity-Disentangled Face Generation on Megapixels

Two Birds with One Stone: Iteratively Learn Facial Attributes with GANs

StyleIPSB: Identity-Preserving Semantic Basis of StyleGAN for High Fidelity Face Swapping

MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation

Two Birds with One Stone: Transforming and Generating Facial Images with Iterative GAN

Controllable and Identity-Aware Facial Attribute Transformation

StyleID: Identity Disentanglement for Anonymizing Faces

Towards Spatially Disentangled Manipulation of Face Images With Pre-Trained StyleGANs

ExFaceGAN: Exploring Identity Directions in GAN's Learned Latent Space for Synthetic Identity Generation

How to Boost Face Recognition with StyleGAN?

FacialGAN: Style Transfer and Attribute Manipulation on Synthetic Faces

A Latent Transformer for Disentangled Face Editing in Images and Videos

Which Style Makes Me Attractive? Interpretable Control Discovery and Counterfactual Explanation on StyleGAN

Controllable Face Image Editing in a Disentanglement Way

Generate Identity-Preserving Faces by Generative Adversarial Networks

Identity-preserving Editing of Multiple Facial Attributes by Learning Global Edit Directions and Local Adjustments

High-Fidelity Face Swapping with Style Blending

Evading Forensic Classifiers with Attribute-Conditioned Adversarial Faces

InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs

PluGeN: Multi-Label Conditional Generation from Pre-trained Models