Abstract: Video-driven 3D facial animation transfer aims to drive avatars to reproduce the expressions of actors. Existing methods have achieved remarkable results by constraining both geometric and perceptual consistency. However, geometric constraints (such as those defined on facial landmarks) are insufficient to capture subtle emotions, while expression features trained on classification tasks are too coarse-grained for complex emotions. To address this, we propose FreeAvatar, a robust facial animation transfer method that relies solely on our learned expression representation. Specifically, FreeAvatar consists of two main components: the expression foundation model and the facial animation transfer model. In the first component, we first construct a facial feature space through a face reconstruction task and then optimize the expression feature space by exploring the similarities among different expressions. Benefiting from training on large amounts of unlabeled facial images and a re-collected expression comparison dataset, our model adapts freely and effectively to any in-the-wild input facial images. In the facial animation transfer component, we propose a novel Expression-driven Multi-avatar Animator, which first maps expressive semantics to the facial control parameters of 3D avatars and then imposes perceptual constraints between the input and output images to maintain expression consistency. To make the entire process differentiable, we employ a trained neural renderer to translate rig parameters into corresponding images. Furthermore, unlike previous methods that require separate decoders for each avatar, we propose a dynamic identity injection module that allows for the joint training of multiple avatars within a single network.
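
To make three ideas from the abstract concrete, the sketch below illustrates a perceptual distance between expression embeddings, a comparison-based objective for the expression feature space, and identity conditioning for a shared decoder. The abstract does not state the actual loss functions or the injection mechanism, so everything here is an assumption: cosine distance, a triplet hinge loss, and one-hot concatenation are common stand-ins, and the names `cosine_distance`, `triplet_loss`, and `inject_identity` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors.

    Assumed stand-in for the perceptual constraint: applied to the expression
    embeddings of the input image and of the rendered avatar image.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss over an expression comparison triplet (assumed objective).

    The loss is zero once the anchor is closer to the similar expression
    than to the dissimilar one by at least `margin`.
    """
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(0.0, gap + margin)


def inject_identity(expr_embedding, avatar_index, num_avatars):
    """Append a one-hot avatar ID to the expression embedding.

    Assumed simplification of identity injection: the conditioned vector
    lets a single shared decoder predict rig parameters for several avatars.
    """
    one_hot = [1.0 if i == avatar_index else 0.0 for i in range(num_avatars)]
    return list(expr_embedding) + one_hot
```

In a full pipeline, the conditioned vector would feed a decoder that outputs rig parameters, a neural renderer would turn those parameters into an image, and a distance such as `cosine_distance` between the input and rendered embeddings would serve as the training signal.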

Semi-supervised video-driven facial animation transfer for production

Sketch Based Multi-source 3D Animation Transfer

Accelerating facial motion capture with video-driven animation transfer

Video-driven state-aware facial animation

Video Tracked Facial Expression Animation

Performance-Driven Animation of Hand-Drawn Cartoon Faces

Controllable high-fidelity facial performance transfer

Neuromuscular Control of the Face-Head-Neck Biomechanical Complex With Learning-Based Expression Transfer From Images and Videos

FreeAvatar: Robust 3D Facial Animation Transfer by Learning an Expression Foundation Model

Versatile Face Animator: Driving Arbitrary 3D Facial Avatar in RGBD Space

Non-corresponding and topology-free 3D face expression transfer

Video-Driven Neural Physically-Based Facial Asset for Production

Transferring of Speech Movements from Video to 3D Face Space

Displaced Dynamic Expression Regression for Real-Time Facial Tracking and Animation

One-shot Human Motion Transfer via Occlusion-Robust Flow Prediction and Neural Texturing

Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks

Real-Time Facial Expression Mapping for High Resolution 3D Meshes

High-Fidelity Neural Human Motion Transfer from Monocular Video

A data-driven approach for facial expression synthesis in video

Vision Based Speech Animation Transferring with Underlying Anatomical Structure

Human Motion Transfer With 3D Constraints and Detail Enhancement