DEGAS: Detailed Expressions on Full-Body Gaussian Avatars

Zhijing Shao, Duotun Wang, Qing-Yao Tian, Yao-Dong Yang, Hengyu Meng, Zeyu Cai, Bo Dong, Yu Zhang, Kang Zhang, Zeyu Wang

2024-08-20

Abstract: Although neural rendering has made significant advancements in creating lifelike, animatable full-body and head avatars, incorporating detailed expressions into full-body avatars remains largely unexplored. We present DEGAS, the first 3D Gaussian Splatting (3DGS)-based modeling method for full-body avatars with rich facial expressions. Trained on multiview videos of a given subject, our method learns a conditional variational autoencoder that takes both the body motion and facial expression as driving signals to generate Gaussian maps in the UV layout. To drive the facial expressions, instead of the commonly used 3D Morphable Models (3DMMs) in 3D head avatars, we propose to adopt the expression latent space trained solely on 2D portrait images, bridging the gap between 2D talking faces and 3D avatars. Leveraging the rendering capability of 3DGS and the rich expressiveness of the expression latent space, the learned avatars can be reenacted to produce photorealistic rendered images with subtle and accurate facial expressions. Experiments on an existing dataset and our newly proposed dataset of full-body talking avatars demonstrate the efficacy of our method. We also propose an audio-driven extension of our method with the help of 2D talking faces, opening new possibilities for interactive AI agents.

Computer Vision and Pattern Recognition, Graphics

What problem does this paper attempt to address?

This paper addresses the problem of adding rich facial expressions to full-body 3D avatars. Existing neural rendering techniques can create realistic, animatable full-body or head avatars, but integrating detailed facial expressions into full-body avatars remains underexplored. The paper proposes DEGAS (Detailed Expressions on Full-Body Gaussian Avatars), the first modeling method based on 3D Gaussian Splatting (3DGS) for full-body avatars with rich facial expressions. Trained on multiview videos of a subject, the method learns a conditional variational autoencoder that takes both body motion and facial expression as driving signals and generates Gaussian maps in the UV layout. To drive facial expressions, the paper adopts an expression latent space trained solely on 2D portrait images rather than the 3D Morphable Models (3DMMs) common in 3D head avatars, thereby bridging the gap between 2D talking faces and 3D avatars. The learned avatars can then be reenacted to produce photorealistic renderings with subtle and accurate facial expressions. Experiments on an existing dataset and on the newly proposed dataset of full-body talking avatars demonstrate the method's effectiveness. The paper also proposes an audio-driven extension that relies on 2D talking-face techniques, opening new possibilities for interactive AI agents.
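The data flow summarized above (body motion and an expression latent condition a variational decoder, which outputs a map of Gaussian parameters in UV space) might look roughly like the following toy sketch. Everything here is an assumption for illustration: the class and function names, the 14-channel texel layout, the vector sizes, and the single linear layer standing in for a learned network are all hypothetical and are not taken from the paper.

```python
import numpy as np

# Hypothetical per-texel layout (not from the paper):
# 3 position offset + 4 rotation quaternion + 3 log-scale + 1 opacity + 3 color.
CHANNELS = 3 + 4 + 3 + 1 + 3


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def reparameterize(mu, log_var, rng):
    """Standard VAE reparameterization: z = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


class ToyGaussianMapDecoder:
    """Hypothetical linear decoder standing in for a learned network."""

    def __init__(self, pose_dim, expr_dim, latent_dim, uv_size, rng):
        self.uv_size = uv_size
        in_dim = pose_dim + expr_dim + latent_dim
        out_dim = uv_size * uv_size * CHANNELS
        self.weight = rng.standard_normal((in_dim, out_dim)) * 0.01
        self.bias = np.zeros(out_dim)

    def __call__(self, pose, expression, z):
        # Condition on both driving signals plus the sampled latent code.
        cond = np.concatenate([pose, expression, z])
        flat = cond @ self.weight + self.bias
        return flat.reshape(self.uv_size, self.uv_size, CHANNELS)


def gaussians_from_map(gaussian_map, uv_mask):
    """Read one Gaussian per valid UV texel and apply simple activations."""
    texels = gaussian_map[uv_mask]  # shape (N, CHANNELS)
    quat = texels[:, 3:7]
    quat = quat / (np.linalg.norm(quat, axis=1, keepdims=True) + 1e-8)
    return {
        "offset": texels[:, 0:3],
        "rotation": quat,                      # unit quaternions
        "scale": np.exp(texels[:, 7:10]),      # strictly positive
        "opacity": sigmoid(texels[:, 10:11]),  # in (0, 1)
        "color": sigmoid(texels[:, 11:14]),    # in (0, 1)
    }


rng = np.random.default_rng(0)
pose = rng.standard_normal(72)        # placeholder body-motion vector
expression = rng.standard_normal(64)  # placeholder expression latent
z = reparameterize(np.zeros(16), np.zeros(16), rng)

decoder = ToyGaussianMapDecoder(72, 64, 16, uv_size=8, rng=rng)
gaussian_map = decoder(pose, expression, z)

uv_mask = np.zeros((8, 8), dtype=bool)
uv_mask[2:6, 2:6] = True              # texels covered by the body surface
gaussians = gaussians_from_map(gaussian_map, uv_mask)
```

In this sketch the UV mask selects which texels correspond to the body surface, so the number of Gaussians equals the number of valid texels. A real system would replace the linear layer with a trained convolutional decoder and pass the resulting Gaussians to a 3DGS rasterizer.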

Expressive Whole-Body 3D Gaussian Avatar

Expressive Gaussian Human Avatars from Monocular RGB Video

Gaussian Deja-vu: Creating Controllable 3D Gaussian Head-Avatars with Enhanced Generalization and Personalization Abilities

DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion

GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians

GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning

FAGhead: Fully Animate Gaussian Head from Monocular Videos

E^3Gen: Efficient, Expressive and Editable Avatars Generation

Drivable 3D Gaussian Avatars

GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians

GaussianSpeech: Audio-Driven Gaussian Avatars

Generalizable and Animatable Gaussian Head Avatar

Animatable and Relightable Gaussians for High-fidelity Human Avatar Modeling

AvatarReX: Real-time Expressive Full-body Avatars

XAGen: 3D Expressive Human Avatars Generation

ExpAvatar: High-Fidelity Avatar Generation of Unseen Expressions with 3D Face Priors

Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping

3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting