Abstract: 3D facial animation is often produced by manipulating facial deformation models (or rigs), that are traditionally parameterized by expression controls. A key component that is usually overlooked is expression 'style', as in, how a particular expression is performed. Although it is common to define a semantic basis of expressions that characters can perform, most characters perform each expression in their own style. To date, style is usually entangled with the expression, and it is not possible to transfer the style of one character to another when considering facial animation. We present a new face model, based on a data-driven implicit neural physics model, that can be driven by both expression and style separately. At the core, we present a framework for learning implicit physics-based actuations for multiple subjects simultaneously, trained on a few arbitrary performance capture sequences from a small set of identities. Once trained, our method allows generalized physics-based facial animation for any of the trained identities, extending to unseen performances. Furthermore, it grants control over the animation style, enabling style transfer from one character to another or blending styles of different characters. Lastly, as a physics-based model, it is capable of synthesizing physical effects, such as collision handling, setting our method apart from conventional approaches.

What problem does this paper attempt to address?

This paper presents a new facial animation model, based on a data-driven implicit neural physics model, that can be driven by expression and style independently. Traditional 3D facial animation is typically created by controlling a facial deformation model (or rig) parameterized by expression controls, which overlooks the "style" of an expression, i.e., how a particular character performs it. Each character has its own expression style, and because existing approaches entangle style with expression, the style of one character's facial animation cannot be transferred to another. The new model learns from data of multiple identities and exposes expression and style as separate controls. At its core is a framework for learning implicit physics-based actuations for multiple subjects simultaneously, requiring only a few performance capture sequences from a small set of identities. Once trained, the method produces generalized physics-based facial animation for any trained identity, extends to unseen performances, and provides control over animation style, enabling style transfer between characters or blending of different characters' styles. As a physics-based model, it can also synthesize physical effects such as collision handling, which sets it apart from conventional methods.

The researchers also made several design decisions: using an implicit neural network that is independent of anatomical structures, which avoids constructing consistent simulation grids; parameterizing expressions with traditional blend weights and style with a quantized style code; and applying Lipschitz weight regularization to smooth the learned actuations, which enables the separation of expression and style spaces. The model also incorporates a robust and diverse collision model.
In summary, the problem addressed in this paper is how to create a facial animation model that takes expressions and styles into account simultaneously, allowing style transfer between different characters while simulating physical effects, thereby improving the realism and flexibility of facial animation.
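The design choices named in the summary (blend-weight expression input, a separate style code, and Lipschitz weight regularization) can be illustrated with a small sketch. This is not the paper's implementation. It assumes one common formulation of Lipschitz regularization, in which each layer's weight rows are rescaled so that their absolute sum stays under a trainable bound and the product of those bounds is added to the loss. The function names, the one-hot style codes, and the concatenated input layout are all hypothetical choices made for illustration.

```python
import math


def softplus(x):
    """Numerically stable softplus; keeps the per-layer bound positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def lipschitz_normalize(weights, c):
    """Rescale each row so its absolute sum is at most softplus(c).

    Assumed formulation: this bounds the layer's Lipschitz constant
    under the infinity norm. `c` would be a trainable scalar per layer.
    """
    bound = softplus(c)
    normalized = []
    for row in weights:
        row_sum = sum(abs(w) for w in row)
        scale = min(1.0, bound / row_sum) if row_sum > 0 else 1.0
        normalized.append([w * scale for w in row])
    return normalized


def lipschitz_penalty(layer_bounds):
    """Product of per-layer bounds, used as an extra loss term (assumption)."""
    penalty = 1.0
    for c in layer_bounds:
        penalty *= softplus(c)
    return penalty


def blend_styles(style_a, style_b, t):
    """Linearly interpolate two style codes; t=0 gives style_a, t=1 gives style_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(style_a, style_b)]


def build_input(position, expression_weights, style_code):
    """Hypothetical network input: query point, blend weights, style code."""
    return list(position) + list(expression_weights) + list(style_code)


# Two identities with one-hot style codes (an illustrative assumption).
style_a, style_b = [1.0, 0.0], [0.0, 1.0]
mixed_style = blend_styles(style_a, style_b, 0.25)

# Same expression blend weights, different style: only the style part changes.
query = build_input((0.1, 0.2, 0.3), [0.5, 0.0], mixed_style)

# Constrain one layer's weights and compute the regularization term.
layer = lipschitz_normalize([[3.0, -1.0], [0.5, 0.5]], c=0.0)
penalty = lipschitz_penalty([0.0, 0.0])
```

Because the expression weights and the style code occupy separate slots of the input, a style can be swapped or blended while the expression is held fixed. The intuition behind the smoothness constraint is that a network whose output changes gradually with its inputs makes such interpolated codes more likely to yield plausible results.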

An Implicit Physical Face Model Driven by Expression and Style
