Abstract: Recent advancements in 3D Gaussian Splatting (3DGS) have unlocked significant potential for modeling 3D head avatars, providing greater flexibility than mesh-based methods and more efficient rendering than NeRF-based approaches. Despite these advancements, the creation of controllable 3DGS-based head avatars remains time-intensive, often requiring tens of minutes to hours. To expedite this process, we introduce the "Gaussian Déjà-vu" framework, which first obtains a generalized model of the head avatar and then personalizes the result. The generalized model is trained on large 2D (synthetic and real) image datasets. This model provides a well-initialized 3D Gaussian head that is further refined using a monocular video to achieve the personalized head avatar. For personalization, we propose learnable expression-aware rectification blendmaps to correct the initial 3D Gaussians, ensuring rapid convergence without reliance on neural networks. Experiments demonstrate that the proposed method meets its objectives. It outperforms state-of-the-art 3D Gaussian head avatars in photorealistic quality and reduces training time to a quarter or less of that of existing methods, producing the avatar in minutes.

What problem does this paper attempt to address?

The paper aims to create high-quality, controllable 3D Gaussian head avatars efficiently. Specifically, it focuses on the following points:

1. **Efficiency**: Existing 3DGS-based head avatar methods are time-consuming to train, often taking tens of minutes or even several hours. The proposed method aims to cut training time significantly so that an avatar is produced within a few minutes.
2. **Quality**: Existing methods fall short in realism, especially for complex regions such as hair. The proposed method improves the realism of the generated avatars by improving the initialization and optimization of the 3D Gaussians.
3. **Controllability**: Existing methods are limited in how well they control facial expressions, head poses, and camera viewpoints. The proposed method achieves fine-grained control over these aspects by introducing learnable expression-aware rectification blendmaps.

To achieve these goals, the paper proposes the "Gaussian Déjà-vu" framework, which consists of two stages:

1. **Single-image reconstruction**: First, a generalized 3D Gaussian avatar model is trained on large 2D image datasets (both synthetic and real). This model quickly produces an initial 3D Gaussian avatar that serves as a good starting point for personalization.
2. **Monocular-video-based optimization**: Then, a monocular video is used to refine the initial 3D Gaussian avatar so that it better matches the specific individual. The learnable rectification blendmaps correct the initial Gaussians directly, so the optimization converges quickly without relying on neural networks.

The experimental results show that the method outperforms existing methods in both training time and realism, and that it maintains consistent, high-quality output across viewpoints.
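The idea of correcting an initial Gaussian head with expression-dependent blendmaps, without a neural network, can be illustrated with a small NumPy sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the names `rectify` and `fit_step`, the UV-map layout `(H, W, C)`, the linear combination of `K` blendmaps weighted by expression coefficients, and the squared-error loss against a directly observed target (the actual method would compare rendered images against video frames).

```python
import numpy as np

def rectify(init_attrs, blendmaps, expr):
    """Add an expression-weighted sum of blendmaps to the initial attributes.

    init_attrs: (H, W, C) map of initial Gaussian attributes (assumed layout)
    blendmaps:  (K, H, W, C) learnable correction maps, one per expression basis
    expr:       (K,) expression coefficients for the current frame
    """
    offsets = np.tensordot(expr, blendmaps, axes=1)  # sum over K -> (H, W, C)
    return init_attrs + offsets

def fit_step(init_attrs, blendmaps, expr, target, lr=0.1):
    """One gradient-descent step on 0.5 * ||rectify(...) - target||^2."""
    residual = rectify(init_attrs, blendmaps, expr) - target
    # The model is linear in the blendmaps, so the gradient for map k
    # is simply expr[k] * residual; no network or autograd is needed.
    grad = expr[:, None, None, None] * residual[None]
    return blendmaps - lr * grad, float(np.mean(residual ** 2))

# Toy data: the "generalized model" output plus a small person-specific error.
rng = np.random.default_rng(0)
K, H, W, C = 4, 8, 8, 3
init_attrs = rng.normal(size=(H, W, C))
target = init_attrs + 0.05 * rng.normal(size=(H, W, C))
expr = rng.uniform(0.0, 1.0, size=K)
blendmaps = np.zeros((K, H, W, C))  # zero maps leave the initial head unchanged

losses = []
for _ in range(50):
    blendmaps, loss = fit_step(init_attrs, blendmaps, expr, target)
    losses.append(loss)

print(f"loss: {losses[0]:.6f} -> {losses[-1]:.6f}")
```

Because the correction is a direct, linear function of the learnable maps, each step is a cheap closed-form gradient update, which is one plausible reading of why such a scheme could converge quickly from a good initialization.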

Gaussian Deja-vu: Creating Controllable 3D Gaussian Head-Avatars with Enhanced Generalization and Personalization Abilities

GaussianHead: High-fidelity Head Avatars with Learnable Gaussian Derivation

GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians

3D Gaussian Parametric Head Model

HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors

MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar

Generalizable and Animatable Gaussian Head Avatar

GPHM: Gaussian Parametric Head Model for Monocular Head Avatar Reconstruction

GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations

3D Gaussian Blendshapes for Head Avatar Animation

FAGhead: Fully Animate Gaussian Head from Monocular Videos

GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning

MixedGaussianAvatar: Realistically and Geometrically Accurate Head Avatar via Mixed 2D-3D Gaussian Splatting

GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians

PSAvatar: A Point-based Shape Model for Real-Time Head Avatar Animation with 3D Gaussian Splatting

HHAvatar: Gaussian Head Avatar with Dynamic Hairs

Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping

GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting

GaussianStyle: Gaussian Head Avatar via StyleGAN

Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing