PyGS: Large-scale Scene Representation with Pyramidal 3D Gaussian Splatting

Zipeng Wang, Dan Xu
2024-05-29
Abstract: Neural Radiance Fields (NeRFs) have demonstrated remarkable proficiency in synthesizing photorealistic images of large-scale scenes. However, they are often plagued by a loss of fine details and long rendering durations. 3D Gaussian Splatting has recently been introduced as a potent alternative, achieving both high-fidelity visual results and accelerated rendering performance. Nonetheless, scaling 3D Gaussian Splatting is fraught with challenges. Specifically, large-scale scenes must integrate objects across multiple scales and disparate viewpoints, which often compromises quality because the Gaussians need to balance between detail levels. Furthermore, generating initialization points via COLMAP from large-scale datasets is both computationally demanding and prone to incomplete reconstructions. To address these challenges, we present Pyramidal 3D Gaussian Splatting (PyGS) with NeRF Initialization. Our approach represents the scene with a hierarchical assembly of Gaussians arranged in a pyramidal fashion. The top level of the pyramid is composed of a few large Gaussians, while each subsequent layer accommodates a denser collection of smaller Gaussians. We effectively initialize these pyramidal Gaussians by sampling a rapidly trained grid-based NeRF at various frequencies. We group these pyramidal Gaussians into clusters and use a compact weighting network to dynamically determine the influence of each pyramid level of each cluster, taking the camera viewpoint into account during rendering. Our method achieves a significant performance leap across multiple large-scale datasets and attains a rendering time that is over 400 times faster than current state-of-the-art approaches.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the loss of detail and the long rendering times of existing 3D scene representation methods on large-scale scenes. Specifically:

1. **Detail loss**: Existing Neural Radiance Field (NeRF) methods synthesize realistic images of large-scale scenes, but the spectral bias of neural networks pushes them toward learning the low-frequency elements of a scene, which makes complex details hard to capture.
2. **Excessive rendering time**: NeRF rendering is very slow, mainly because volume rendering requires a large number of samples per ray. This computational burden severely limits its practicality in real-time applications.
3. **Challenges in scaling 3D Gaussian Splatting (3DGS)**: 3DGS renders high-quality images in real time, but it faces several challenges when extended to large-scale scenes. Such scenes contain objects and detail levels at different scales, which traditional 3DGS methods have difficulty balancing. In addition, generating the initial point cloud with COLMAP is both time-consuming and prone to incomplete reconstructions on large-scale datasets.

To solve these problems, the authors propose **Pyramidal 3D Gaussian Splatting (PyGS)**. PyGS addresses the above challenges through the following innovations:

- **Multi-scale Gaussian structure**: PyGS uses a hierarchical 3D Gaussian splatting structure. The top level consists of a few large Gaussians, and each lower level contains a denser set of smaller Gaussians. This structure captures scene details at different scales.
- **Dynamic weighting network**: The Gaussians are grouped into clusters, and a compact weighting network dynamically adjusts the contribution of each pyramid level of each cluster according to the camera viewpoint and region complexity, which yields adaptive rendering.
- **Efficient initialization technique**: A rapidly trained grid-based NeRF is sampled at various frequencies to generate the initial point clouds, significantly reducing pre-processing time and improving performance.

With these improvements, PyGS shows significant gains in quality and efficiency in experiments on multiple large-scale datasets. In particular, its rendering speed is more than 400 times faster than the current state-of-the-art NeRF methods.
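
The pyramid construction and the per-cluster level weighting described above can be illustrated with a minimal sketch. It is not the authors' implementation. `toy_density` is a hypothetical stand-in for querying a trained grid-based NeRF, the grid resolutions are arbitrary, and `level_weights` is a hand-written distance heuristic that stands in for the learned weighting network (the paper's network is learned, and its inputs and architecture are not reproduced here).

```python
import math


def toy_density(x, y, z):
    """Hypothetical stand-in for a density query on a trained grid-based NeRF.

    Models a solid ball of radius 0.5 centred at the origin.
    """
    return 1.0 if x * x + y * y + z * z <= 0.25 else 0.0


def sample_level(density_fn, resolution, threshold=0.5):
    """Sample a regular grid over [-1, 1]^3 and keep the occupied cell centres."""
    step = 2.0 / resolution
    points = []
    for i in range(resolution):
        for j in range(resolution):
            for k in range(resolution):
                p = (-1.0 + (i + 0.5) * step,
                     -1.0 + (j + 0.5) * step,
                     -1.0 + (k + 0.5) * step)
                if density_fn(*p) > threshold:
                    points.append(p)
    # Assumption: the initial Gaussian scale is tied to the cell size,
    # so coarse levels hold few large Gaussians and fine levels many small ones.
    return {"points": points, "scale": step}


def build_pyramid(density_fn, resolutions=(4, 8, 16)):
    """Level 0 is the coarsest; each later level samples at a higher frequency."""
    return [sample_level(density_fn, r) for r in resolutions]


def level_weights(camera_pos, cluster_center, num_levels):
    """Heuristic stand-in for the learned weighting network (an assumption).

    Far cameras put almost all weight on the coarsest level; near cameras
    spread the weight more evenly across levels. Returns softmax weights.
    """
    dist = math.dist(camera_pos, cluster_center)
    logits = [-(level * dist) for level in range(num_levels)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    pyramid = build_pyramid(toy_density)
    for idx, level in enumerate(pyramid):
        print(idx, len(level["points"]), level["scale"])
    print(level_weights((100.0, 0.0, 0.0), (0.0, 0.0, 0.0), len(pyramid)))
    print(level_weights((0.01, 0.0, 0.0), (0.0, 0.0, 0.0), len(pyramid)))
```

In a real system the weights would come from a trained network and would modulate each level's Gaussians inside the rasterizer. The sketch only shows the data layout: one point set and scale per level, plus one normalized weight per level for a given cluster and camera.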