An Implicit Physical Face Model Driven by Expression and Style

Lingchen Yang, Gaspard Zoss, Prashanth Chandran, Paulo Gotardo, Markus Gross, Barbara Solenthaler, Eftychios Sifakis, Derek Bradley
DOI: https://doi.org/10.1145/3610548.3618156
2024-01-27
Abstract: 3D facial animation is often produced by manipulating facial deformation models (or rigs), that are traditionally parameterized by expression controls. A key component that is usually overlooked is expression 'style', as in, how a particular expression is performed. Although it is common to define a semantic basis of expressions that characters can perform, most characters perform each expression in their own style. To date, style is usually entangled with the expression, and it is not possible to transfer the style of one character to another when considering facial animation. We present a new face model, based on a data-driven implicit neural physics model, that can be driven by both expression and style separately. At the core, we present a framework for learning implicit physics-based actuations for multiple subjects simultaneously, trained on a few arbitrary performance capture sequences from a small set of identities. Once trained, our method allows generalized physics-based facial animation for any of the trained identities, extending to unseen performances. Furthermore, it grants control over the animation style, enabling style transfer from one character to another or blending styles of different characters. Lastly, as a physics-based model, it is capable of synthesizing physical effects, such as collision handling, setting our method apart from conventional approaches.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
This paper presents a facial animation model, built on a data-driven implicit neural physics model, that can be driven by expression and style independently. Traditional 3D facial animation is typically created by controlling a facial deformation model (or rig) through expression parameters. This overlooks the "style" of an expression, i.e., how a particular character performs it. Each character has its own expression style, but style is usually entangled with expression, so the style of one character's facial animation cannot readily be transferred to another.
The core of the method is a framework for learning implicit physics-based actuations for multiple subjects simultaneously, trained on only a few performance capture sequences from a small set of identities. Once trained, the model produces generalized physics-based facial animation for any of the trained identities, extends to unseen performances, and exposes control over animation style, enabling style transfer from one character to another or blending of different characters' styles. Because the model is physics-based, it can also synthesize physical effects such as collision handling, which sets it apart from conventional approaches.
The authors make several design decisions: an implicit neural network that is independent of anatomical structure, which avoids constructing consistent simulation meshes across identities; a traditional blend-weight parameterization for expression together with a quantized style code for style; and Lipschitz weight regularization to smooth the learned actuations, which helps separate the expression and style spaces. The model also incorporates a robust and diverse collision model.
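The separation of expression and style described above can be illustrated with a toy sketch. It is not the authors' implementation: the network size, the function names (`blend_styles`, `lipschitz_normalize`, `actuation`), the treatment of the style code as a continuous vector, and the row-sum form of the Lipschitz bound are all assumptions made for illustration, since this summary does not give those details.

```python
import math
import random


def blend_styles(style_a, style_b, t):
    """Linearly interpolate two style codes (t=0 -> style_a, t=1 -> style_b)."""
    return [(1.0 - t) * a + t * b for a, b in zip(style_a, style_b)]


def lipschitz_normalize(weights, bound):
    """Rescale each row so its absolute sum is at most `bound`.

    This caps the infinity-norm of the matrix, so a layer built from it
    (followed by a 1-Lipschitz nonlinearity) changes its output by at most
    `bound` times the largest change in any input coordinate.
    """
    normalized = []
    for row in weights:
        row_sum = sum(abs(w) for w in row)
        scale = min(1.0, bound / row_sum) if row_sum > 0.0 else 1.0
        normalized.append([w * scale for w in row])
    return normalized


def actuation(point, expression, style, weights, biases):
    """Toy single-layer implicit function: (point, expression, style) -> actuation values."""
    features = list(point) + list(expression) + list(style)
    return [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical sizes: 3D query point, 4 blend weights, 2D style code, 6 outputs.
rng = random.Random(0)
raw_weights = [[rng.uniform(-1.0, 1.0) for _ in range(9)] for _ in range(6)]
weights = lipschitz_normalize(raw_weights, bound=1.5)
biases = [0.0] * 6

point = [0.1, -0.2, 0.3]
expression = [0.8, 0.0, 0.2, 0.0]  # the same expression is used for every style
style_a, style_b = [1.0, 0.0], [0.0, 1.0]

# Same point and expression, evaluated with style A and with a 50/50 style blend.
out_a = actuation(point, expression, style_a, weights, biases)
out_mix = actuation(point, expression, blend_styles(style_a, style_b, 0.5), weights, biases)
```

In this sketch the expression vector stays fixed while only the style input changes, and the bounded weights guarantee that a small step in style space produces a correspondingly small change in actuation, which is the kind of smoothness that makes blending between styles well behaved.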
In summary, the problem addressed in this paper is how to build a facial animation model in which expression and style can be controlled independently, so that style can be transferred between characters while physical effects are still simulated, improving the realism and flexibility of facial animation.