SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting

Zhijing Shao, Zhaolong Wang, Zhuang Li, Duotun Wang, Xiangru Lin, Yu Zhang, Mingming Fan, Zeyu Wang
2024-03-08
Abstract: We present SplattingAvatar, a hybrid 3D representation of photorealistic human avatars with Gaussian Splatting embedded on a triangle mesh, which renders over 300 FPS on a modern GPU and 30 FPS on a mobile device. We disentangle the motion and appearance of a virtual human with explicit mesh geometry and implicit appearance modeling with Gaussian Splatting. The Gaussians are defined by barycentric coordinates and displacement on a triangle mesh as Phong surfaces. We extend lifted optimization to simultaneously optimize the parameters of the Gaussians while walking on the triangle mesh. SplattingAvatar is a hybrid representation of virtual humans where the mesh represents low-frequency motion and surface deformation, while the Gaussians take over the high-frequency geometry and detailed appearance. Unlike existing deformation methods that rely on an MLP-based linear blend skinning (LBS) field for motion, we control the rotation and translation of the Gaussians directly by mesh, which empowers its compatibility with various animation techniques, e.g., skeletal animation, blend shapes, and mesh editing. Trainable from monocular videos for both full-body and head avatars, SplattingAvatar shows state-of-the-art rendering quality across multiple datasets.
Graphics, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses real-time rendering of high-quality, personalized, realistic human avatars. It proposes SplattingAvatar, which combines a triangle-mesh geometric representation with Gaussian Splatting to render 3D human avatars efficiently and with high fidelity. The main issues the paper attempts to solve are as follows:

1. **Improving Rendering Quality**: Traditional methods improve 3D human models by increasing polygon counts, adding texture maps, and using advanced hair systems, all of which raise computational cost. SplattingAvatar aims to add detail and realism while keeping rendering real-time.
2. **Real-time Rendering Capability**: Existing Neural Radiance Field (NeRF) methods can capture high-frequency details but suffer from blurriness in the inverse mapping process, and motion control based on Multi-Layer Perceptrons (MLPs) overlooks the advantages of a mesh representation for capturing surface deformation. SplattingAvatar addresses both issues, rendering at over 300 FPS on a modern GPU and 30 FPS on a mobile device.
3. **Decoupling Motion and Appearance**: The mesh explicitly carries motion, while the Gaussians model appearance. Separating the two lets avatars show natural movement and fine detail across different scenarios.
4. **Compatibility with Various Animation Techniques**: Because the mesh directly drives the Gaussians, SplattingAvatar supports skeletal animation, blend shapes, and mesh editing, which makes it adaptable to applications such as gaming, extended reality (XR), and remote presentations.

In summary, SplattingAvatar aims to provide an efficient, high-quality 3D avatar rendering solution suitable for a range of application scenarios.
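
The abstract states that each Gaussian is defined by barycentric coordinates and a displacement on a triangle mesh treated as a Phong surface. The following is a minimal sketch of one plausible reading of that embedding, not the authors' implementation: the Gaussian center is taken to be the barycentric interpolation of the triangle's vertices, offset along the normalized interpolation of the vertex normals. The function name `embed_point` and all parameter names are hypothetical and chosen for illustration only.

```python
import numpy as np

def embed_point(vertices, normals, face, bary, displacement):
    """Place a point on a mesh from (triangle, barycentric coords, displacement).

    Assumed formula (illustrative): p = sum_i b_i * v_i + d * normalize(sum_i b_i * n_i)
    """
    bary = np.asarray(bary, dtype=float)
    tri_vertices = vertices[face]   # (3, 3): corner positions of the triangle
    tri_normals = normals[face]     # (3, 3): per-vertex normals of the triangle
    base = bary @ tri_vertices      # interpolated position on the flat triangle
    normal = bary @ tri_normals     # Phong-style interpolated normal
    normal = normal / np.linalg.norm(normal)
    return base + displacement * normal

# A single triangle in the z = 0 plane with normals pointing along +z.
vertices = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
normals = np.array([[0.0, 0.0, 1.0]] * 3)
face = np.array([0, 1, 2])

bary = [0.2, 0.3, 0.5]
p_rest = embed_point(vertices, normals, face, bary, displacement=0.1)

# Deform the mesh (here a simple translation); the embedded point follows it
# with no change to its own parameters.
moved_vertices = vertices + np.array([0.0, 0.0, 2.0])
p_moved = embed_point(moved_vertices, normals, face, bary, displacement=0.1)

print(p_rest, p_moved)
```

Because the point stores only a triangle index, barycentric coordinates, and a scalar displacement, any technique that moves the mesh vertices (skeletal animation, blend shapes, manual editing) moves the embedded point as well, which is the compatibility property the summary describes.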