Abstract: Synthesis and reconstruction of 3D human heads have gained increasing interest in computer vision and computer graphics recently. Existing state-of-the-art 3D generative adversarial networks (GANs) for 3D human head synthesis are either limited to near-frontal views or struggle to preserve 3D consistency at large view angles. We propose PanoHead, the first 3D-aware generative model that enables high-quality view-consistent image synthesis of full heads in $360^\circ$ with diverse appearance and detailed geometry using only in-the-wild unstructured images for training. At its core, we lift up the representation power of recent 3D GANs and bridge the data alignment gap when training from in-the-wild images with widely distributed views. Specifically, we propose a novel two-stage self-adaptive image alignment for robust 3D GAN training. We further introduce a tri-grid neural volume representation that effectively addresses front-face and back-head feature entanglement rooted in the widely-adopted tri-plane formulation. Our method instills prior knowledge of 2D image segmentation in adversarial learning of 3D neural scene structures, enabling compositable head synthesis in diverse backgrounds. Benefiting from these designs, our method significantly outperforms previous 3D GANs, generating high-quality 3D heads with accurate geometry and diverse appearances, even with long wavy and afro hairstyles, renderable from arbitrary poses. Furthermore, we show that our system can reconstruct full 3D heads from single input images for personalized realistic 3D avatars.

What problem does this paper attempt to address?

The paper mainly aims to address the following issues:

1. **Full 3D Head Synthesis**: Existing 3D Generative Adversarial Networks (GANs) perform well for head synthesis from near-frontal views but struggle to maintain 3D consistency under larger view changes. The researchers therefore propose PanoHead, a 3D-aware generative model that achieves high-quality, view-consistent full-head image synthesis across 360 degrees and handles diverse appearances and detailed geometry.
2. **Background and Foreground Separation**: Earlier methods tend to entangle the foreground (head) with the background, which causes artifacts when synthesizing images from large-angle views. To solve this problem, the researchers introduce a foreground-aware tri-discriminator that separates foreground head modeling from background synthesis.
3. **Improvement of 3D Representation**: The tri-plane representation suffers from projection ambiguity under 360-degree views, which leads to artifacts such as a "mirrored face" appearing on the back of the head. The paper proposes a new 3D tri-grid volume representation that addresses this issue and improves expressive power while remaining efficient.
4. **Camera Alignment Challenges**: For in-the-wild images of the back of the head, accurate camera extrinsic parameters are extremely difficult to obtain, and there is an alignment gap between these images and frontal images. Both lead to noisy appearance and suboptimal head geometry. The researchers propose a two-stage alignment scheme and a camera-adaptive module to tackle these challenges.

In summary, the main contribution of this paper is PanoHead, a 3D-aware GAN framework that can be trained from unstructured in-the-wild images and achieves high-fidelity full-head image synthesis with detailed geometry, renderable from any viewpoint across 360 degrees. In support of this, the paper introduces a technique for separating foreground and background, an improved 3D representation, and an image alignment strategy.
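The projection ambiguity behind point 3 can be illustrated with a small sketch. It is not the paper's implementation: it uses scalar features in plain Python lists, looks only at the XY plane (the XZ and YZ planes of a tri-plane do see the depth coordinate), and all function names, grid sizes, and feature values are hypothetical choices made for illustration. The assumption taken from the summary above is that a tri-grid gives each plane several layers along the axis it would otherwise project away, so features can be interpolated along that axis as well.

```python
def make_grid(depth, height, width):
    """Deterministic toy features of shape [depth][height][width] (illustrative values)."""
    return [[[10.0 * k + i + 0.1 * j for j in range(width)]
             for i in range(height)]
            for k in range(depth)]


def to_index(coord, size):
    """Map a coordinate in [-1, 1] to a continuous index in [0, size - 1]."""
    return (coord + 1.0) * 0.5 * (size - 1)


def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def sample_bilinear(plane, x, y):
    """Bilinearly interpolate a 2D feature plane at (x, y), both in [-1, 1]."""
    height, width = len(plane), len(plane[0])
    col, row = to_index(x, width), to_index(y, height)
    c0, r0 = min(int(col), width - 2), min(int(row), height - 2)
    tc, tr = col - c0, row - r0
    top = lerp(plane[r0][c0], plane[r0][c0 + 1], tc)
    bottom = lerp(plane[r0 + 1][c0], plane[r0 + 1][c0 + 1], tc)
    return lerp(top, bottom, tr)


def sample_triplane_xy(plane, x, y, z):
    """Tri-plane XY lookup: the point is projected onto the plane, so z is ignored."""
    return sample_bilinear(plane, x, y)


def sample_trigrid_xy(grid, x, y, z):
    """Tri-grid XY lookup: also interpolate between the layers stacked along z."""
    depth = len(grid)
    k = to_index(z, depth)
    k0 = min(int(k), depth - 2)
    return lerp(sample_bilinear(grid[k0], x, y),
                sample_bilinear(grid[k0 + 1], x, y),
                k - k0)


# Two points with the same (x, y): one in front of the head, one behind it.
front = (0.2, 0.1, 0.8)
back = (0.2, 0.1, -0.8)

xy_plane = make_grid(1, 4, 4)[0]   # a single plane, as in a tri-plane
xy_grid = make_grid(3, 4, 4)       # three layers along z, as in a tri-grid

triplane_front = sample_triplane_xy(xy_plane, *front)
triplane_back = sample_triplane_xy(xy_plane, *back)
trigrid_front = sample_trigrid_xy(xy_grid, *front)
trigrid_back = sample_trigrid_xy(xy_grid, *back)

print("tri-plane XY:", triplane_front, triplane_back)
print("tri-grid XY: ", trigrid_front, trigrid_back)
```

In this toy setup the single XY plane returns the same feature for the front and back points, which is the kind of sharing that can let a face pattern leak onto the back of the head. The layered grid returns different features because the lookup depends on z.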

PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360$^{\circ}$

SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation

Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360°

Head3D: Complete 3D Head Generation via Tri-plane Feature Distillation

OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis

Learning Full-Head 3D GANs from a Single-View Portrait Dataset

GPHM: Gaussian Parametric Head Model for Monocular Head Avatar Reconstruction

Towards Native Generative Model for 3D Head Avatar

3D Gaussian Parametric Head Model

3DPortraitGAN: Learning One-Quarter Headshot 3D GANs from a Single-View Portrait Dataset with Diverse Body Poses

GGHead: Fast and Generalizable 3D Gaussian Heads

GANHead: Towards Generative Animatable Neural Head Avatars

TimeWalker: Personalized Neural Space for Lifelong Head Avatars

GaussianHead: High-fidelity Head Avatars with Learnable Gaussian Derivation

HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors

Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images

Portrait4D: Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data

Single Image, Any Face: Generalisable 3D Face Generation

Prior-Guided Multi-View 3D Head Reconstruction

HQ3DAvatar: High Quality Implicit 3D Head Avatar

Hybrid Approach for 3D Head Reconstruction: Using Neural Networks and Visual Geometry