Abstract: Digital humans and, especially, 3D facial avatars have attracted considerable attention in recent years, as they are the backbone of several applications such as immersive telepresence in AR or VR. Despite this progress, facial avatars reconstructed from commodity hardware are incomplete: they miss parts of the side and back of the head, which severely limits their usability. This limitation in prior work stems from its reliance on face tracking, which fails for profile and back views. To address this issue, we propose to learn person-specific animatable avatars from images without assuming access to precise facial expression tracking. At the core of our method, we leverage a 3D-aware generative model that is trained to reproduce the distribution of facial expressions in the training data. To train this appearance model, we assume only a collection of 2D images with the corresponding camera parameters. To control the model, we learn a mapping from 3DMM facial expression parameters to the latent space of the generative model. This mapping can be learned by sampling the latent space of the appearance model and reconstructing the facial parameters from a normalized frontal view, where facial expression estimation performs well. With this scheme, we decouple 3D appearance reconstruction and animation control to achieve high fidelity in image synthesis. In a series of experiments, we compare our proposed technique to state-of-the-art monocular methods and show superior quality while not requiring expression tracking of the training data.
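
The mapping step described in the abstract (sample latents, estimate expression parameters from a frontal view, then fit a map from expressions back to latents) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the dimensions, the function names, and the fixed linear map that stands in for "render a normalized frontal view and run an expression estimator". A linear least-squares fit also replaces whatever mapping network the method actually trains, so this shows only the data flow, not the paper's implementation.

```python
import numpy as np

LATENT_DIM = 8  # hypothetical latent size of the appearance model
EXPR_DIM = 4    # hypothetical number of 3DMM expression coefficients

_rng = np.random.default_rng(0)
# Toy stand-in for "render a normalized frontal view, then run an
# expression estimator": a fixed linear map from latent to expression.
_TOY_DECODER = _rng.normal(size=(LATENT_DIM, EXPR_DIM))


def estimate_expression_from_frontal_view(latents, noise_std=0.0, rng=None):
    """Toy replacement for frontal rendering plus 3DMM expression estimation."""
    expressions = latents @ _TOY_DECODER
    if noise_std > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        # Simulate imperfect expression estimates.
        expressions = expressions + noise_std * rng.normal(size=expressions.shape)
    return expressions


def fit_expression_to_latent(num_samples=2000, noise_std=0.01, seed=1):
    """Sample latents, estimate their expressions, and regress latents on them."""
    rng = np.random.default_rng(seed)
    # 1) Sample the latent space of the (toy) appearance model.
    latents = rng.normal(size=(num_samples, LATENT_DIM))
    # 2) Recover expression parameters from the (toy) frontal view.
    expressions = estimate_expression_from_frontal_view(latents, noise_std, rng)
    # 3) Fit an affine map expression -> latent by least squares.
    features = np.hstack([expressions, np.ones((num_samples, 1))])
    weights, *_ = np.linalg.lstsq(features, latents, rcond=None)
    return weights  # shape (EXPR_DIM + 1, LATENT_DIM)


def expression_to_latent(weights, expression):
    """Map one expression vector to a latent code with the fitted affine map."""
    features = np.append(np.asarray(expression, dtype=float), 1.0)
    return features @ weights
```

In this sketch, a call such as `expression_to_latent(fit_expression_to_latent(), [0.5, -0.2, 0.1, 0.3])` returns a latent code whose toy-estimated expression is close to the requested one. No expression tracking of training images is involved: the only supervision comes from pairs generated by sampling the model itself.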

NECA: Neural Customizable Human Avatar

Unified Volumetric Avatar: Enabling Flexible Editing and Rendering of Neural Human Representations

Relightable and Animatable Neural Avatars from Videos

HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos

Relightable and Animatable Neural Avatar from Sparse-View Video

Learning Locally Editable Virtual Humans

Neural Head Avatars from Monocular RGB Videos

Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos

AvatarReX: Real-time Expressive Full-body Avatars

Animatable Neural Radiance Fields from Monocular RGB Videos

Artist-Friendly Relightable and Animatable Neural Heads

URAvatar: Universal Relightable Gaussian Codec Avatars

High-Fidelity Human Avatars from a Single RGB Camera

GAN-Avatar: Controllable Personalized GAN-based Human Head Avatar

Deformable 3D Gaussian Splatting for Animatable Human Avatars

Reality's Canvas, Language's Brush: Crafting 3D Avatars from Monocular Video

LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field

HybridAvatar: Efficient Mesh-based Human Avatar Generation from Few-Shot Monocular Images with Implicit Mesh Displacement

AvatarWild: Fully Controllable Head Avatars in the Wild

HQ3DAvatar: High Quality Implicit 3D Head Avatar

MagicMirror: Fast and High-Quality Avatar Generation with a Constrained Search Space