Abstract: Accurate stabilization of facial motion is essential for applications in photoreal avatar construction for 3D games, virtual reality, movies, and training data collection. For the latter, stabilization must work automatically for the general population with people of varying morphology. Distinguishing rigid skull motion from facial expressions is critical, since misalignment between skull motion and facial expressions can lead to animation models that are hard to control and cannot fit natural motion. Existing methods struggle to work with sparse sets of very different expressions, such as when combining multiple units from the Facial Action Coding System (FACS). Certain approaches are not robust enough, some depend on motion data to find stable points, while others make invalid one-size-fits-all physiological assumptions. In this paper, we leverage recent advances in neural signed distance fields and differentiable isosurface meshing to compute skull stabilization rigid transforms directly on unstructured triangle meshes or point clouds, significantly enhancing accuracy and robustness. We introduce the concept of a stable hull as the surface of the boolean intersection of stabilized scans, analogous to the visual hull in shape-from-silhouette and the photo hull from space carving. This hull resembles a skull overlaid with minimal soft tissue thickness, and the upper teeth are automatically included. Our skull carving algorithm simultaneously optimizes the stable hull shape and rigid transforms to get accurate stabilization of complex expressions for large diverse sets of people, outperforming existing methods.

What problem does this paper attempt to address?

This paper addresses the problem of accurately stabilizing facial motion for facial animation, especially in applications such as 3D games, virtual reality, movies, and training data collection. Specifically, it focuses on how to separate the rigid motion of the skull from the non-rigid deformation of facial expressions, so that the resulting animation model stays controllable and can fit natural motion.

### Problem Background

1. **Application Scenarios**:
   - High-resolution, realistic avatar capture is crucial for 3D game characters, face-to-face conversations in virtual reality, and digital doubles in movies.
   - Automatically generating high-quality 3D facial animation remains a challenge, especially across people of varying morphology.
2. **Limitations of Existing Methods**:
   - Existing methods perform poorly on sparse sets of very different facial expressions, such as combinations of multiple Facial Action Coding System (FACS) units.
   - Some methods rely on 4D time-series data to find stable feature points, while others are based on invalid physiological assumptions, such as assuming that skin thickness is the same on all faces.
   - These methods are not robust enough for diverse populations that span various races, ages, and body mass indices (BMI).

### Solutions Proposed in the Paper

To solve the above problems, the authors propose a new algorithm, the **Skull Carving Algorithm**. Its main contributions are as follows:

1. **Concept of Stable Hull**:
   - Introduces the "stable hull", the surface of the Boolean intersection of all stabilized scans, analogous to the visual hull and the photo hull in computer vision.
   - The stable hull resembles a skull overlaid with minimal soft tissue thickness, and it automatically includes the upper teeth.
2. **Optimization Methods**:
   - Uses recent neural signed distance fields (SDFs) and differentiable isosurface meshing to compute the rigid transforms for skull stabilization directly on unstructured triangle meshes or point clouds.
   - Jointly optimizes the shape of the stable hull and the rigid transforms, which yields accurate stabilization of complex expressions for large, diverse sets of people.
3. **Technical Details**:
   - Converts each raw 3D scan into an SDF using fast winding numbers and an Eikonal equation solver.
   - Models rigid transforms with dual quaternions and solves the resulting nonlinear optimization problem by gradient descent.
   - Adopts a two-step pattern-tracking initialization strategy, first optimizing with a larger histogram interval and then fine-tuning with a smaller one.

### Experimental Results

Experiments on a database of 32 people show that the Skull Carving Algorithm performs well on expression scans with visible upper teeth, significantly outperforming other methods, including FLAME, ICP, and pattern-tracking. In particular, it shows higher accuracy and robustness on a diverse database.

### Conclusion

This paper proposes a new method for skull stabilization in facial animation that addresses the limitations of existing methods on sparse, complex combinations of FACS units. Although there is still room for improvement, the Skull Carving Algorithm achieves higher accuracy across diverse populations and does not rely on specific morphological assumptions.

### Formula Display

- **SDF Approximation Formula**:
  \[ \phi_{\theta_i}(x) \approx \mathrm{SDF}_i(x), \quad \forall x \in \Omega \]
- **Stable Hull Function**:
  \[ S(Q) = \gamma\left( \max_{i \in [N]} \phi_{\theta_i}(q_i X_r q_i) \right) \]
- **Optimization Objective**:
  \[ \arg\min_{Q} \frac{1}{N} \sum_{i=1}^{N} \psi\left( \phi_{\theta_i}\left( q_i S(Q) q_i \right) \right) \]
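The stable hull function and the optimization objective above can be illustrated with a small 2D toy. The sketch below is not the authors' implementation and makes several loudly stated assumptions: analytic circle SDFs stand in for the neural SDFs \(\phi_{\theta_i}\), plain 2D translations stand in for the dual-quaternion rigid transforms \(q_i\), a thin band around the zero level set of the maximum stands in for the isosurface mesher \(\gamma\), and \(\psi\) is taken to be the absolute value. All function names are hypothetical.

```python
import numpy as np

def circle_sdf(points, center, radius):
    """Signed distance from each 2D point to a circle (negative inside)."""
    return np.linalg.norm(points - np.asarray(center), axis=-1) - radius

def make_scan_sdf(bump_center, offset):
    """Toy 'scan': a shared unit-circle 'skull' plus a scan-specific
    'soft tissue' bump, observed in a frame shifted by `offset`."""
    offset = np.asarray(offset, dtype=float)

    def sdf(points):
        local = points - offset                    # back to the canonical frame
        skull = circle_sdf(local, (0.0, 0.0), 1.0)
        bump = circle_sdf(local, bump_center, 0.4)
        return np.minimum(skull, bump)             # union of the two shapes

    return sdf

def carving_loss(scan_sdfs, translations, grid, band):
    """Toy version of the objective: evaluate every scan SDF on the
    transformed reference grid, intersect the scans with a pointwise max,
    and average |phi_i| over grid points near the hull's zero level set."""
    values = np.stack([
        sdf(grid + np.asarray(t, dtype=float))     # phi_i(T_i(x))
        for sdf, t in zip(scan_sdfs, translations)
    ])
    hull = values.max(axis=0)                      # Boolean intersection of SDFs
    on_surface = np.abs(hull) < band               # stand-in for meshing (gamma)
    return np.abs(values[:, on_surface]).mean()    # psi = absolute value (assumed)

# Reference grid X_r covering the toy head.
axis = np.linspace(-2.0, 2.0, 201)
grid = np.stack(np.meshgrid(axis, axis), axis=-1).reshape(-1, 2)
band = axis[1] - axis[0]

# Two scans with different bumps (expressions) and different frame offsets.
offsets = [(0.5, -0.2), (-0.3, 0.4)]
scans = [
    make_scan_sdf((1.0, 0.0), offsets[0]),
    make_scan_sdf((-1.0, 0.0), offsets[1]),
]

# Correct stabilization: each translation cancels its scan's offset.
loss_aligned = carving_loss(scans, offsets, grid, band)

# Perturbed stabilization: the second scan is misaligned by 0.3 along x.
perturbed = [offsets[0], (offsets[1][0] + 0.3, offsets[1][1])]
loss_shifted = carving_loss(scans, perturbed, grid, band)

print(f"aligned: {loss_aligned:.4f}  misaligned: {loss_shifted:.4f}")
```

In this toy, the correctly aligned transforms carve the hull down to the shared unit circle, so most hull samples lie on every scan's surface and the loss is lower than for the perturbed transforms. A gradient-based optimizer over the transforms, as the summary describes, would replace the hand-picked comparison used here.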

A Theory of Stabilization by Skull Carving

Learning to Stabilize Faces

Skull-to-Face: Anatomy-Guided 3D Facial Reconstruction and Editing

Statistical skull models from 3D X-ray images

Quantum decoherence of the damped harmonic oscillator

Accurate face rig approximation with deep differential subspace reconstruction

A facial reconstruction method based on new mesh deformation techniques

Democratizing the Creation of Animatable Facial Avatars

Implicit Neural Head Synthesis via Controllable Local Deformation Fields

A statistical shape modeling approach for predicting subject-specific human skull from head surface

SurFhead: Affine Rig Blending for Geometrically Accurate 2D Gaussian Surfel Head Avatars

Interactive 3D Face Stylization Using Sculptural Abstraction

High-fidelity facial and speech animation for VR HMDs

A novel approach to craniofacial analysis using automated 3D landmarking of the skull

TriHuman: A Real-time and Controllable Tri-plane Representation for Detailed Human Geometry and Appearance Synthesis

HACK: Learning a Parametric Head and Neck Model for High-fidelity Animation

Automated 3D Landmarking of the Skull: A Novel Approach for Craniofacial Analysis

General Automatic Human Shape and Motion Capture Using Volumetric Contour Cues

A Methodology to Create 3D Body Models in Motion

A New Algorithm for 3d Facial Model Reconstruction and Its Application in Vr