FlashAvatar: High-fidelity Head Avatar with Efficient Gaussian Embedding

Jun Xiang, Xuan Gao, Yudong Guo, Juyong Zhang
2024-03-30
Abstract: We propose FlashAvatar, a novel and lightweight 3D animatable avatar representation that can reconstruct a digital avatar from a short monocular video sequence in minutes and render high-fidelity photo-realistic images at 300 FPS on a consumer-grade GPU. To achieve this, we maintain a uniform 3D Gaussian field embedded in the surface of a parametric face model and learn extra spatial offsets to model non-surface regions and subtle facial details. While full use of geometric priors can capture high-frequency facial details and preserve exaggerated expressions, proper initialization can help reduce the number of Gaussians, thus enabling super-fast rendering speed. Extensive experimental results demonstrate that FlashAvatar outperforms existing works regarding visual quality and personalized details and is almost an order of magnitude faster in rendering speed. Project page: https://ustc3dv.github.io/FlashAvatar/
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
The paper addresses high-fidelity reconstruction and real-time rendering of digital human heads at low cost. It proposes FlashAvatar, a method that reconstructs a high-quality, animatable 3D head avatar from a short monocular video in minutes and renders photo-realistic images at 300 frames per second on a consumer-grade GPU. The focus is on improving reconstruction efficiency and rendering speed without sacrificing visual quality or personalized detail. To do this, FlashAvatar combines a non-neural 3D Gaussian field with an explicit parametric face model: the Gaussians are embedded uniformly in the surface of the face model, and learned spatial offsets let them cover non-surface regions and subtle facial details. The geometric prior helps capture high-frequency details and preserve exaggerated expressions, while the surface-based initialization keeps the number of Gaussians small, which is what enables the fast rendering. According to the authors, this overcomes the limitations of existing methods in geometric detail modeling and rendering speed.
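The idea of a Gaussian that is embedded in the mesh surface and then displaced by a learned offset can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `SurfaceGaussian` class, the use of barycentric coordinates to attach a Gaussian to a triangle, and the fixed per-Gaussian `offset` are hypothetical, and the actual method may sample, parameterize, and predict offsets differently.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class SurfaceGaussian:
    """Hypothetical Gaussian anchored to one triangle of a face mesh."""
    face: int                           # index of the triangle it is attached to
    bary: Tuple[float, float, float]    # barycentric coordinates inside that triangle
    offset: Vec3 = (0.0, 0.0, 0.0)      # learned displacement, zero at initialization


def gaussian_center(g: SurfaceGaussian,
                    vertices: List[Vec3],
                    faces: List[Tuple[int, int, int]]) -> Vec3:
    """Return the surface point of the anchoring triangle plus the offset."""
    i, j, k = faces[g.face]
    a, b, c = g.bary
    # Point on the (possibly deformed) mesh surface.
    surface = tuple(
        a * vertices[i][d] + b * vertices[j][d] + c * vertices[k][d]
        for d in range(3)
    )
    # The offset moves the Gaussian off the surface where needed.
    return tuple(surface[d] + g.offset[d] for d in range(3))


# One triangle in a neutral pose, and the same triangle after the mesh moves.
faces = [(0, 1, 2)]
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
deformed = [(x, y, z + 1.0) for (x, y, z) in neutral]

g = SurfaceGaussian(face=0, bary=(0.5, 0.25, 0.25), offset=(0.0, 0.0, 0.1))
center_neutral = gaussian_center(g, neutral, faces)
center_deformed = gaussian_center(g, deformed, faces)
```

Because the Gaussian is defined relative to the mesh, it follows the mesh when the expression or pose changes, and the offset only has to account for what the mesh cannot represent. In this sketch the offset is a constant; a learned, expression-dependent offset would replace it in a real system.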