Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields

Joo Chan Lee, Daniel Rho, Xiangyu Sun, Jong Hwan Ko, Eunbyung Park
2024-08-07
Abstract: 3D Gaussian splatting (3DGS) has recently emerged as an alternative representation that leverages a 3D Gaussian-based representation and introduces an approximated volumetric rendering, achieving very fast rendering speed and promising image quality. Furthermore, subsequent studies have successfully extended 3DGS to dynamic 3D scenes, demonstrating its wide range of applications. However, a significant drawback arises as 3DGS and its following methods entail a substantial number of Gaussians to maintain the high fidelity of the rendered images, which requires a large amount of memory and storage. To address this critical issue, we place a specific emphasis on two key objectives: reducing the number of Gaussian points without sacrificing performance and compressing the Gaussian attributes, such as view-dependent color and covariance. To this end, we propose a learnable mask strategy that significantly reduces the number of Gaussians while preserving high performance. In addition, we propose a compact but effective representation of view-dependent color by employing a grid-based neural field rather than relying on spherical harmonics. Finally, we learn codebooks to compactly represent the geometric and temporal attributes by residual vector quantization. With model compression techniques such as quantization and entropy coding, we consistently show over 25x reduced storage and enhanced rendering speed compared to 3DGS for static scenes, while maintaining the quality of the scene representation. For dynamic scenes, our approach achieves more than 12x storage efficiency and retains a high-quality reconstruction compared to the existing state-of-the-art methods. Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering.
Our project page is available at https://maincold2.github.io/c3dgs/.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper primarily aims to address two major challenges that radiance field representations face in practical applications:

1. The computational bottleneck caused by ray-based volumetric rendering, which makes it difficult for Neural Radiance Fields (NeRFs) to achieve real-time rendering, especially on handheld devices or low-end GPUs.
2. The excessive storage and memory usage of 3D Gaussian splatting (3DGS), which needs a large number of 3D Gaussians to maintain high-quality rendered images.

To solve these issues, the paper proposes a compact 3D Gaussian splatting framework. Its main contributions are:

1. **Reducing the number of Gaussians**: A learnable masking strategy identifies and removes Gaussians that contribute little to the overall rendering quality, significantly reducing their number. This lowers memory and storage requirements and also improves rendering speed.
2. **Compressing Gaussian attributes**: Per-Gaussian attributes such as view-dependent color and the covariance matrix are stored in more compact forms. For example, a grid-based neural field represents view-dependent color, replacing the spherical harmonics coefficients that would otherwise be stored for each Gaussian individually.
3. **Codebook representation of geometric and temporal attributes**: To further reduce storage, the paper uses codebooks learned through residual vector quantization to represent geometric attributes (such as scale and rotation) and, in dynamic scenes, temporal attributes. Each Gaussian stores codebook indices instead of the parameter values themselves.
4. **Extension to dynamic scenes**: Beyond static scenes, the paper extends the method to dynamic scenes by learning representative temporal trajectories that efficiently represent motion attributes. This achieves performance comparable to existing state-of-the-art techniques in dynamic scene representation while significantly reducing the number of parameters.

In summary, the proposed method improves the efficiency of 3D scene representation by reducing the number of Gaussians and compressing their attributes, achieving high-quality real-time rendering with lower memory and storage overhead.
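
The codebook idea can be illustrated with a minimal sketch of residual vector quantization. This is not the authors' implementation: the function names (`nearest_code`, `rvq_encode`, `rvq_decode`) and the hand-picked `TOY_CODEBOOKS` are hypothetical, and in the paper the codebooks are learned during training rather than fixed. The sketch only shows how a per-Gaussian attribute vector can be replaced by one index per quantization stage.

```python
from typing import List, Sequence

Vector = Sequence[float]
Codebook = Sequence[Vector]


def nearest_code(vec: Vector, codebook: Codebook) -> int:
    """Return the index of the codebook entry closest to vec (squared L2 distance)."""
    def sq_dist(code: Vector) -> float:
        return sum((v - c) ** 2 for v, c in zip(vec, code))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def rvq_encode(vec: Vector, codebooks: Sequence[Codebook]) -> List[int]:
    """Quantize vec stage by stage; each stage encodes the residual left by the previous one."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        indices.append(idx)
        # Subtract the chosen code so the next stage only sees what is still unexplained.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(indices: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Reconstruct the vector as the sum of the selected code from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical two-stage codebooks for a 2D attribute (assumption: real ones are learned).
TOY_CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]],        # stage 1: coarse codes
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],  # stage 2: residual refinements
]

attribute = [1.25, 0.75]                 # e.g. a toy scale vector of one Gaussian
codes = rvq_encode(attribute, TOY_CODEBOOKS)
print(codes)                             # [1, 1]
print(rvq_decode(codes, TOY_CODEBOOKS))  # [1.25, 0.75]
```

With S stages of K entries each, a Gaussian stores S indices of about log2(K) bits instead of full floating-point values, and the codebooks themselves are shared across all Gaussians. In general the reconstruction is approximate; the toy values here were chosen so that it is exact.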