Abstract: Multi-view volumetric rendering techniques have recently shown great potential in modeling and synthesizing high-quality head avatars. A common approach to capturing full-head dynamic performances is to track the underlying geometry using a mesh-based template or 3D cube-based graphics primitives. While these model-based approaches achieve promising results, they often fail to learn complex geometric details such as the mouth interior, hair, and topological changes over time. This paper presents a novel approach to building highly photorealistic digital head avatars. Our method learns a canonical space via an implicit function parameterized by a neural network. It leverages multiresolution hash encoding in the learned feature space, allowing for high quality, faster training, and high-resolution rendering. At test time, our method is driven by a monocular RGB video. Here, an image encoder extracts face-specific features that also condition the learnable canonical space. This encourages deformation-dependent texture variations during training. We also propose a novel optical-flow-based loss that ensures correspondences in the learned canonical space, thus encouraging artifact-free and temporally consistent renderings. We show results on challenging facial expressions and free-viewpoint renderings at interactive real-time rates for medium image resolutions. Our method outperforms all existing approaches, both visually and numerically. We will release our multiple-identity dataset to encourage further research. Our project page is available at: https://vcai.mpi-inf.mpg.de/projects/HQ3DAvatar/

FlashAvatar: High-fidelity Head Avatar with Efficient Gaussian Embedding

Generalizable and Animatable Gaussian Head Avatar

FATE: Full-head Gaussian Avatar with Textural Editing from Monocular Video

FAGhead: Fully Animate Gaussian Head from Monocular Videos

GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion

GGAvatar: Geometric Adjustment of Gaussian Head Avatar

HQ3DAvatar: High Quality Implicit 3D Head Avatar

HQ3DAvatar: High Quality Controllable 3D Head Avatar

GraphAvatar: Compact Head Avatars with GNN-Generated 3D Gaussians

MixedGaussianAvatar: Realistically and Geometrically Accurate Head Avatar via Mixed 2D-3D Gaussian Splatting

MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar

GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians

AvatarMAV: Fast 3D Head Avatar Reconstruction Using Motion-Aware Neural Voxels

Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos

Animatable 3D Gaussian: Fast and High-Quality Reconstruction of Multiple Human Avatars

LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field

Interactive Rendering of Relightable and Animatable Gaussian Avatars

GPAvatar: Generalizable and Precise Head Avatar from Image(s)

GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning

GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians