SpecGaussian with Latent Features: A High-quality Modeling of the View-dependent Appearance for 3D Gaussian Splatting

Zhiru Wang, Shiyun Xie, Chengwei Pan, Guoping Wang
DOI: https://doi.org/10.1145/3664647.3681059
2024-08-23
Abstract: Recently, the 3D Gaussian Splatting (3D-GS) method has achieved great success in novel view synthesis, providing real-time rendering while ensuring high-quality rendering results. However, this method faces challenges in modeling specular reflections and handling anisotropic appearance components, especially in dealing with view-dependent color under complex lighting conditions. Additionally, 3D-GS uses spherical harmonics to learn the color representation, which has limited ability to represent complex scenes. To overcome these challenges, we introduce Latent-SpecGS, an approach that utilizes a universal latent neural descriptor within each 3D Gaussian. This enables a more effective representation of 3D feature fields, including appearance and geometry. Moreover, two parallel CNNs are designed to decode the splatted feature maps into diffuse color and specular color separately. A viewpoint-dependent mask is learned to merge these two colors, resulting in the final rendered image. Experimental results demonstrate that our method obtains competitive performance in novel view synthesis and extends the ability of 3D-GS to handle intricate scenarios with specular reflections.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

The paper "SpecGaussian with Latent Features: A High-quality Modeling of the View-dependent Appearance for 3D Gaussian Splatting" addresses three challenges that the 3D Gaussian Splatting (3D-GS) method faces in novel view synthesis:

1. **Specular Reflection Modeling**: 3D-GS handles view-dependent color poorly under complex lighting conditions, especially in scenes with strong specular reflections. Low-order Spherical Harmonics (SH) can model only subtle view-dependent effects and cannot capture complex specular reflections.
2. **Complex Scene Representation**: 3D-GS uses Spherical Harmonics to learn the color representation, which limits its ability to represent complex scenes.
3. **Memory Consumption**: Storing Spherical Harmonic coefficients for every Gaussian is unnecessary, leads to excessive memory usage, and does not fully resolve the common artifacts in 3D-GS.

To overcome these issues, the authors propose Latent-SpecGS, which strengthens the representation capability of 3D-GS by embedding a universal latent neural descriptor in each 3D Gaussian. Two parallel Convolutional Neural Networks (CNNs) then decode the splatted feature maps into diffuse color and specular color, respectively, and a learned view-dependent mask merges the two colors into the final rendered image. Experimental results show that the method achieves competitive performance in novel view synthesis and extends 3D-GS to intricate scenes with specular reflections.
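
The final merge step can be illustrated with a small sketch. The text here does not give the blending formula, so the code below assumes a per-pixel convex combination in which a mask value of 0 selects the diffuse color and 1 selects the specular color. The names `blend_pixel` and `blend_image` are hypothetical, and the inputs stand in for the outputs of the two CNN decoders and the learned mask, none of which is modeled here.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # RGB values in [0, 1]


def blend_pixel(diffuse: Pixel, specular: Pixel, mask: float) -> Pixel:
    """Blend two colors with a scalar mask in [0, 1].

    Assumption (not stated in the summary): the merge is a convex
    combination, (1 - mask) * diffuse + mask * specular.
    """
    if not 0.0 <= mask <= 1.0:
        raise ValueError("mask must lie in [0, 1]")
    return tuple((1.0 - mask) * d + mask * s for d, s in zip(diffuse, specular))


def blend_image(
    diffuse: List[List[Pixel]],
    specular: List[List[Pixel]],
    mask: List[List[float]],
) -> List[List[Pixel]]:
    """Apply blend_pixel to every pixel of equally sized H x W images."""
    return [
        [blend_pixel(d, s, m) for d, s, m in zip(d_row, s_row, m_row)]
        for d_row, s_row, m_row in zip(diffuse, specular, mask)
    ]


if __name__ == "__main__":
    # A 1 x 2 toy image: the first pixel is fully diffuse, the second mixed.
    diffuse = [[(0.8, 0.2, 0.2), (0.8, 0.2, 0.2)]]
    specular = [[(1.0, 1.0, 1.0), (1.0, 1.0, 1.0)]]
    mask = [[0.0, 0.5]]
    print(blend_image(diffuse, specular, mask))
```

In a real pipeline the mask would itself be predicted per pixel from the viewpoint, so highlights can appear or vanish as the camera moves while the diffuse layer stays stable.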