Abstract: Neural rendering methods have significantly advanced photo-realistic 3D scene rendering in various academic and industrial applications. The recent 3D Gaussian Splatting method has achieved state-of-the-art rendering quality and speed, combining the benefits of both primitive-based representations and volumetric representations. However, it often leads to heavily redundant Gaussians that try to fit every training view, neglecting the underlying scene geometry. Consequently, the resulting model becomes less robust to significant view changes, texture-less areas and lighting effects. We introduce Scaffold-GS, which uses anchor points to distribute local 3D Gaussians, and predicts their attributes on-the-fly based on viewing direction and distance within the view frustum. Anchor growing and pruning strategies are developed based on the importance of neural Gaussians to reliably improve the scene coverage. We show that our method effectively reduces redundant Gaussians while delivering high-quality rendering. We also demonstrate an enhanced capability to accommodate scenes with varying levels-of-detail and view-dependent observations, without sacrificing the rendering speed.

What problem does this paper attempt to address?

### Problems Addressed by the Paper

This paper addresses the redundancy and lack of robustness of existing 3D Gaussian Splatting methods (such as 3D-GS) on complex scenes. While 3D-GS performs well in rendering quality and speed, it has two problems:

1. **Redundancy**: To fit every training view, 3D-GS often generates a large number of redundant Gaussians, making the model bulky and difficult to scale to large and complex scenes.
2. **Lack of Robustness**: Because 3D-GS fits the training views while making little use of the underlying scene structure, it performs poorly on novel views, textureless regions, and lighting effects.

To solve these problems, the authors propose Scaffold-GS, which introduces anchors to distribute local 3D Gaussians and predicts the attributes of these Gaussians on the fly from the viewing direction and distance. This reduces redundant Gaussians and improves the model's robustness and efficiency across different levels of detail and view-dependent observations.

### Main Contributions

1. **Hierarchical and Region-Aware Scene Representation**: Anchors initialized from a sparse voxel grid guide the distribution of local 3D Gaussians, forming a hierarchical and region-aware scene representation.
2. **View-Dependent Neural Gaussian Prediction**: Within the view frustum, the attributes of neural Gaussians are predicted on the fly from each anchor to adapt to different viewing directions and distances, yielding more robust novel view synthesis.
3. **Reliable Anchor Growing and Pruning Strategy**: The predicted neural Gaussians drive the growing and pruning of anchors, improving scene coverage.
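
The first two contributions can be illustrated with a small sketch. This is a toy illustration under stated assumptions, not the authors' implementation: the function names (`voxelize`, `view_features`, `tiny_mlp`, `spawn_neural_gaussians`), the per-anchor feature, offsets and scale, the single sigmoid layer standing in for a learned network, and the `min_opacity` threshold are all hypothetical. It only shows the general pattern of placing one anchor per occupied voxel and decoding a per-Gaussian attribute (here, opacity only) from the camera-to-anchor distance and direction.

```python
import math


def voxelize(points, voxel_size):
    """Return one anchor per occupied voxel, placed at the voxel center."""
    cells = {tuple(math.floor(c / voxel_size) for c in p) for p in points}
    return sorted(tuple((i + 0.5) * voxel_size for i in cell) for cell in cells)


def view_features(anchor, camera):
    """Distance and unit direction from the camera to the anchor."""
    delta = [a - c for a, c in zip(anchor, camera)]
    dist = math.sqrt(sum(d * d for d in delta))
    safe = max(dist, 1e-8)  # avoid division by zero if camera sits on the anchor
    return dist, [d / safe for d in delta]


def tiny_mlp(inputs, weights, bias):
    """One linear layer plus sigmoid; a stand-in for a learned decoder (assumption)."""
    outputs = []
    for row, b in zip(weights, bias):
        z = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


def spawn_neural_gaussians(anchor, feature, offsets, scale, camera,
                           weights, bias, min_opacity=0.0):
    """Decode one opacity per offset from (feature, distance, direction).

    Gaussians whose opacity does not exceed `min_opacity` are dropped.
    `weights` needs one row per offset, each of length len(feature) + 4.
    """
    dist, direction = view_features(anchor, camera)
    inputs = list(feature) + [dist] + direction
    opacities = tiny_mlp(inputs, weights, bias)
    gaussians = []
    for offset, opacity in zip(offsets, opacities):
        if opacity <= min_opacity:
            continue
        position = tuple(a + scale * o for a, o in zip(anchor, offset))
        gaussians.append({"position": position, "opacity": opacity})
    return gaussians
```

Because the attributes are recomputed for each camera, the same anchor can emit different Gaussians for different views. A fuller model would decode color, scale and rotation in the same way; those are omitted here to keep the sketch short.
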
### Experimental Results

Through extensive experiments on multiple datasets, the authors demonstrate the performance of Scaffold-GS, especially on complex outdoor scenes and detailed indoor environments. Compared to 3D-GS, Scaffold-GS significantly reduces storage requirements while maintaining similar rendering speed and achieving better visual quality. The improvements appear in the following aspects:

- **Visual Quality**: In challenging scenes (such as textureless regions, lighting effects, and fine geometric structures), Scaffold-GS exhibits higher visual quality, with less blurring and fewer needle-like artifacts.
- **Storage Efficiency**: The storage size of Scaffold-GS is several times smaller than that of 3D-GS.
- **View Adaptability**: Scaffold-GS adapts and generalizes better on multi-scale scenes and novel views.

### Conclusion

By introducing anchors and view-dependent neural Gaussian prediction, Scaffold-GS effectively addresses the redundancy and lack of robustness of 3D-GS, providing a new solution for efficient, high-quality rendering of complex scenes.

Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering
