Relightable Gaussian Codec Avatars

Shunsuke Saito, Gabriel Schwartz, Tomas Simon, Junxuan Li, Giljoo Nam
2024-05-28
Abstract: The fidelity of relighting is bounded by both geometry and appearance representations. For geometry, both mesh and volumetric approaches have difficulty modeling intricate structures like 3D hair geometry. For appearance, existing relighting models are limited in fidelity and often too slow to render in real-time with high-resolution continuous environments. In this work, we present Relightable Gaussian Codec Avatars, a method to build high-fidelity relightable head avatars that can be animated to generate novel expressions. Our geometry model based on 3D Gaussians can capture 3D-consistent sub-millimeter details such as hair strands and pores on dynamic face sequences. To support diverse materials of human heads such as the eyes, skin, and hair in a unified manner, we present a novel relightable appearance model based on learnable radiance transfer. Together with global illumination-aware spherical harmonics for the diffuse components, we achieve real-time relighting with all-frequency reflections using spherical Gaussians. This appearance model can be efficiently relit under both point light and continuous illumination. We further improve the fidelity of eye reflections and enable explicit gaze control by introducing relightable explicit eye models. Our method outperforms existing approaches without compromising real-time performance. We also demonstrate real-time relighting of avatars on a tethered consumer VR headset, showcasing the efficiency and fidelity of our avatars.
Graphics, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper aims to build high-fidelity, relightable head avatars that can be animated and rendered in real time. It identifies three challenges:

1. **Complex material properties**: The human head combines materials such as skin, hair, and eyes, each with different scattering and reflection characteristics. Skin produces complex reflections because of its micro-geometry and exhibits significant subsurface scattering; hair exhibits out-of-plane scattering and multiple reflections because of its translucent fiber structure; the eyes have multiple highly reflective membranes. No single existing material representation captures all of these accurately, especially in real time.
2. **Accurate geometric modeling**: Tracking and modeling dynamic faces is difficult because deforming regions do not always provide enough visual cues for tracking. Fine structures such as hair are especially hard to capture with existing mesh and volumetric representations.
3. **Real-time performance requirements**: Applications such as games and telecommunication typically require photorealistic avatars to be synthesized in real time. However, as photorealism increases, the cost of computing light transport and tracking motion increases exponentially, which severely constrains algorithm design.

To address these challenges, the paper makes three main contributions:

1. **Drivable avatars based on 3D Gaussians**: Representing geometry with 3D Gaussians makes it possible to efficiently render intricate details such as hair strands and pores.
2. **Relightable appearance model**: Built on learnable radiance transfer, it supports global illumination and all-frequency reflections, enabling real-time relighting.
3. **Relightable explicit eye model**: An explicit eye model yields high-fidelity eye reflections and gaze control independent of other facial motion.

Together, these contributions target a high-fidelity, relightable head avatar that can be rendered in real time under all-frequency illumination.
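The split between a spherical-harmonics diffuse term and a spherical-Gaussian specular term can be illustrated with a small, generic sketch. This is not the paper's implementation: the function names are hypothetical, the transfer coefficients are hand-picked constants standing in for values that a learned model would predict, only SH bands 0 and 1 are used, and the specular lobe uses the textbook spherical Gaussian form, which may differ from the exact lobe the authors use.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sh_basis_l1(d):
    """Real spherical harmonics (bands 0 and 1) for a unit direction d."""
    x, y, z = d
    return (0.282095, 0.488603 * y, 0.488603 * z, 0.488603 * x)

def spherical_gaussian(v, axis, sharpness, amplitude=1.0):
    """Textbook spherical Gaussian lobe: amplitude * exp(sharpness * (v . axis - 1))."""
    return amplitude * math.exp(sharpness * (dot(v, axis) - 1.0))

def reflect(view, normal):
    """Mirror a unit view direction (pointing away from the surface) about the normal."""
    k = 2.0 * dot(view, normal)
    return tuple(k * n - v for n, v in zip(normal, view))

def shade_point_light(transfer, normal, view, light_dir, sharpness, intensity=1.0):
    """Return (diffuse, specular) for one directional light.

    transfer: 4 SH transfer coefficients (assumed here; learned in a real model).
    """
    normal, view, light_dir = normalize(normal), normalize(view), normalize(light_dir)
    # Project the directional light onto the SH basis, then take the
    # radiance-transfer dot product to get the diffuse response.
    light_sh = tuple(intensity * b for b in sh_basis_l1(light_dir))
    diffuse = max(0.0, dot(transfer, light_sh))
    # Specular: a narrow lobe centred on the mirror reflection of the view direction.
    specular = intensity * spherical_gaussian(light_dir, reflect(view, normal), sharpness)
    return diffuse, specular

transfer = (1.0, 0.0, 0.5, 0.0)  # illustrative values only
front = shade_point_light(transfer, (0, 0, 1), (0, 0, 1), (0, 0, 1), sharpness=50.0)
side = shade_point_light(transfer, (0, 0, 1), (0, 0, 1), (1, 0, 0), sharpness=50.0)
print(front, side)
```

With the light aligned to the mirror direction the specular lobe is at its peak, while a light from the side contributes almost nothing to it; the low-order SH term varies smoothly in both cases. A larger `sharpness` narrows the lobe, which is how a spherical Gaussian can represent higher-frequency reflections than a few SH bands can.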