Abstract: Facial image editing has become an active research topic in recent years, driven by rapid progress in deep generative models. Current models are based on either the variational autoencoder (VAE) or the generative adversarial network (GAN). However, VAE-based models usually generate over-smoothed images, while purely GAN-based models cannot randomly generate images with specific attributes and suffer from unstable training. To overcome these limitations, a novel attribute-disentangled generative model that combines VAE and GAN is proposed for facial image editing. The model supports both manipulating specific attributes of an image and synthesizing facial images conditioned on specified attributes. In the encoder-decoder architecture of the proposed model, the latent space mapped by the encoder is split into two subspaces: the attribute-irrelevant space and the attribute-relevant space. The attribute-irrelevant space characterizes factors such as identity, position, and background, which are expected to remain unchanged during editing. The attribute-relevant space represents the attributes to be manipulated, such as hair color, gender, and age. The model is trained with an adversarial scheme in which generated images are fed back into the encoder, so that their distribution stays close to the real data distribution in the attribute-irrelevant subspace while they are correctly classified in the attribute-relevant subspace, without requiring an explicit discriminator as in GANs. To evaluate the proposed model, quantitative and qualitative comparisons with other state-of-the-art algorithms were conducted on the CelebA dataset. The results show that the proposed model can effectively generate high-quality facial images with diverse specified attributes.
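
The latent-space split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `LatentCode`, `split_latent`, `edit_attributes`, and `merge_latent` are hypothetical, the latent code is a plain list of floats, and the assumption that the attribute-relevant part occupies the last `n_attr` dimensions is made only for illustration. The encoder and decoder networks themselves are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class LatentCode:
    """Latent code split into two subspaces (hypothetical container)."""
    irrelevant: List[float]  # identity, position, background, ...
    relevant: List[float]    # hair color, gender, age, ...


def split_latent(z: Sequence[float], n_attr: int) -> LatentCode:
    """Split an encoder output z into the two subspaces.

    Assumption: the last n_attr dimensions hold the attribute-relevant part.
    """
    if not 0 < n_attr < len(z):
        raise ValueError("n_attr must be between 1 and len(z) - 1")
    cut = len(z) - n_attr
    return LatentCode(irrelevant=list(z[:cut]), relevant=list(z[cut:]))


def edit_attributes(code: LatentCode, changes: Dict[int, float]) -> LatentCode:
    """Return a new code with selected attribute dimensions overwritten.

    The attribute-irrelevant part is copied unchanged, which is what should
    keep identity and background fixed during editing.
    """
    relevant = list(code.relevant)
    for index, value in changes.items():
        relevant[index] = value
    return LatentCode(irrelevant=list(code.irrelevant), relevant=relevant)


def merge_latent(code: LatentCode) -> List[float]:
    """Concatenate both subspaces into the vector a decoder would consume."""
    return code.irrelevant + code.relevant


# Usage: flip the second attribute while leaving the other factors untouched.
z = [0.1, 0.2, 0.3, 0.4, 1.0, 0.0]  # made-up encoder output
code = split_latent(z, n_attr=2)
edited = edit_attributes(code, {1: 1.0})
decoder_input = merge_latent(edited)
```

In this sketch, editing touches only the attribute-relevant entries, so the first four values of `decoder_input` are identical to those of `z`. Sampling the attribute-irrelevant part from a prior while fixing the attribute-relevant part would correspond to the conditional synthesis mentioned in the abstract.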

Self-supervised Deformation Modeling for Facial Expression Editing

Realistic Face Reenactment Via Self-Supervised Disentangling of Identity and Pose

Video Tracked Facial Expression Animation

Neural Face Editing with Intrinsic Image Disentangling

Facial Landmarks and Expression Label Guided Photorealistic Facial Expression Synthesis

Facial Attribute Editing Using Generative Adversarial Network

Intuitive Facial Animation Editing Based On A Generative RNN Framework

Facial Expression Editing with Continuous Emotion Labels

DeepFaceEditing: Deep Face Generation and Editing with Disentangled Geometry and Appearance Control

Video-Driven Neural Physically-Based Facial Asset for Production

A Robust Interactive Facial Animation Editing System

Self-Supervised Emotion Representation Disentanglement for Speech-Preserving Facial Expression Manipulation

Continuously Controllable Facial Expression Editing in Talking Face Videos

High-Fidelity Face Manipulation With Extreme Poses and Expressions

Realistic Dynamic Facial Textures from a Single Image Using GANs

Deep Energies for Estimating Three-Dimensional Facial Pose and Expression

A Unified Architecture of Semantic Segmentation and Hierarchical Generative Adversarial Networks for Expression Manipulation

Toward Fine-grained Facial Expression Manipulation

A novel attribute-based generation architecture for facial image editing

Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network

DynamicAvatars: Accurate Dynamic Facial Avatars Reconstruction and Precise Editing with Diffusion Models