Abstract: 3D Gaussian Splatting (3DGS) is increasingly popular for 3D reconstruction due to its superior visual quality and rendering speed. However, 3DGS training currently occurs on a single GPU, limiting its ability to handle high-resolution and large-scale 3D reconstruction tasks due to memory constraints. We introduce Grendel, a distributed system designed to partition 3DGS parameters and parallelize computation across multiple GPUs. As each Gaussian affects a small, dynamic subset of rendered pixels, Grendel employs sparse all-to-all communication to transfer the necessary Gaussians to pixel partitions and performs dynamic load balancing. Unlike existing 3DGS systems that train using one camera view image at a time, Grendel supports batched training with multiple views. We explore various optimization hyperparameter scaling strategies and find that a simple sqrt(batch size) scaling rule is highly effective. Evaluations using large-scale, high-resolution scenes show that Grendel enhances rendering quality by scaling up 3DGS parameters across multiple GPUs. On the Rubble dataset, we achieve a test PSNR of 27.28 by distributing 40.4 million Gaussians across 16 GPUs, compared to a PSNR of 26.28 using 11.2 million Gaussians on a single GPU. Grendel is an open-source project available at: https://github.com/nyu-systems/Grendel-GS
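
To make the sqrt(batch size) scaling rule mentioned in the abstract concrete, here is a minimal sketch of how hyperparameters tuned for single-view training might be rescaled for a larger batch. The function name `scale_hyperparameters`, the base values, and in particular the treatment of the momentum coefficients (raising each to the power of the batch size) are assumptions made for illustration; the abstract only states that a sqrt(batch size) rule works well.

```python
import math


def scale_hyperparameters(base_lr, base_betas, batch_size):
    """Rescale Adam-style hyperparameters tuned for batch size 1.

    Hypothetical helper, not taken from the Grendel code base.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Square-root rule: the learning rate grows with sqrt(batch size).
    lr = base_lr * math.sqrt(batch_size)

    # Assumption: each momentum coefficient is raised to the power of the
    # batch size, so one batched step decays past gradients about as much
    # as batch_size single-view steps would.
    betas = tuple(beta ** batch_size for beta in base_betas)
    return lr, betas


# Illustrative base values only; they are not quoted from the paper.
lr, betas = scale_hyperparameters(base_lr=1.6e-4, base_betas=(0.9, 0.999), batch_size=4)
print(lr, betas)
```

With a batch size of 1 the function returns the base values unchanged, so the same configuration can be reused for single-view and batched training.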

What problem does this paper attempt to address?

The paper primarily addresses the issue of single GPU memory limitations faced by 3D Gaussian Splatting (3DGS) technology when handling large-scale, high-resolution 3D reconstruction tasks. Specifically, the research team designed a distributed training system named Grendel, aimed at extending the training capabilities of 3DGS through multi-GPU parallel computing. The main problems addressed by the paper include:

1. **Memory Limitation**: Current 3DGS training is typically constrained by the memory capacity of a single GPU, limiting its ability to handle high-resolution and large-scale scenes.
2. **Computational Efficiency**: Due to the computational bottleneck of a single GPU, the processing efficiency for large scenes is low.
3. **Support for Batch Training**: Traditional 3DGS training methods process only one view image at a time, whereas Grendel supports batch training, processing multiple view images simultaneously to improve efficiency.

To overcome these challenges, Grendel employs the following strategies:

- **Distributed Parameter Storage**: Distributes the parameters of 3DGS (such as position and shape) across multiple GPUs.
- **Hybrid Parallelism**: Uses different parallel strategies at different stages, such as Gaussian-based parallelism and pixel-based parallelism.
- **Sparse All-to-All Communication**: Utilizes the spatial locality characteristics of 3DGS to transmit only the Gaussian splats related to specific pixel blocks, reducing communication overhead.
- **Dynamic Load Balancing**: Reallocates pixels based on the computation time from previous training iterations to balance the workload across different GPUs.
- **Batch Training Optimization**: Proposes a simple square root rule to adjust learning rate and momentum parameters, maintaining good training performance even with increased batch sizes.

Through these methods, Grendel scales 3DGS training across multiple GPUs, rendering large-scale, high-resolution scenes efficiently and handling larger datasets than a single GPU can manage. Experimental results show that on the "Rubble" dataset, using 16 GPUs to distribute 40.4 million Gaussian splats, Grendel achieved a test peak signal-to-noise ratio (PSNR) of 27.28, which is 1.00 higher than the single-GPU result (11.2 million Gaussian splats achieving a PSNR of 26.28). Additionally, even for smaller scenes (such as the "Train" scene), Grendel was able to achieve speed improvements without compromising the quality of the test results (PSNR).
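
The dynamic load balancing strategy described above can be sketched roughly as follows. The summary only says that pixels are reallocated based on computation time measured in previous iterations; everything else here is an assumption for illustration, including the function name `partition_blocks`, the idea of timing fixed-size pixel blocks, and the choice to give each GPU one contiguous range of blocks.

```python
from bisect import bisect_left
from itertools import accumulate


def partition_blocks(block_times, num_gpus):
    """Split an ordered list of pixel blocks into contiguous ranges of
    roughly equal measured cost.

    block_times: per-block render time from a previous iteration.
    Returns one half-open (start, end) index range per GPU.
    Hypothetical helper, not taken from the Grendel code base.
    """
    if not block_times or num_gpus < 1:
        raise ValueError("need at least one block and one GPU")

    prefix = list(accumulate(block_times))  # cumulative cost up to each block
    total = prefix[-1]

    bounds = [0]
    for k in range(1, num_gpus):
        target = total * k / num_gpus
        # Cut just after the first block whose cumulative cost reaches the target.
        cut = bisect_left(prefix, target) + 1
        # Keep boundaries ordered and inside the list.
        cut = min(max(cut, bounds[-1]), len(block_times))
        bounds.append(cut)
    bounds.append(len(block_times))

    return [(bounds[i], bounds[i + 1]) for i in range(num_gpus)]


# Two expensive blocks followed by six cheap ones, shared by two GPUs.
ranges = partition_blocks([3, 3, 1, 1, 1, 1, 1, 1], num_gpus=2)
print(ranges)
```

In this toy input the first GPU receives only the two expensive blocks while the second receives the six cheap ones, so both end up with the same total measured cost. Re-running the partition each iteration with fresh timings is what would make the balancing dynamic.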

On Scaling Up 3D Gaussian Splatting Training

Efficient Density Control for 3D Gaussian Splatting

DOGS: Distributed-Oriented Gaussian Splatting for Large-Scale 3D Reconstruction Via Gaussian Consensus

GigaGS: Scaling up Planar-Based 3D Gaussians for Large Scene Surface Reconstruction

Multi-Scale 3D Gaussian Splatting for Anti-Aliased Rendering

Balanced 3DGS: Gaussian-wise Parallelism Rendering with Fine-Grained Tiling

RetinaGS: Scalable Training for Dense Scene Rendering with Billion-Scale 3D Gaussians

Taming 3DGS: High-Quality Radiance Fields with Limited Resources

CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians

Pixel-GS: Density Control with Pixel-aware Gradient for 3D Gaussian Splatting

EfficientGS: Streamlining Gaussian Splatting for Large-Scale High-Resolution Scene Representation

AbsGS: Recovering Fine Details for 3D Gaussian Splatting

Speedy-Splat: Fast 3D Gaussian Splatting with Sparse Pixels and Sparse Primitives

PyGS: Large-scale Scene Representation with Pyramidal 3D Gaussian Splatting

A Hierarchical 3D Gaussian Representation for Real-Time Rendering of Very Large Datasets

Unbounded-GS: Extending 3D Gaussian Splatting with Hybrid Representation for Unbounded Large-Scale Scene Reconstruction

GaussianPro: 3D Gaussian Splatting with Progressive Propagation

GS-Net: Generalizable Plug-and-Play 3D Gaussian Splatting Module

DyGASR: Dynamic Generalized Exponential Splatting with Surface Alignment for Accelerated 3D Mesh Reconstruction