CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes

Yang Liu, Chuanchen Luo, Zhongkai Mao, Junran Peng, Zhaoxiang Zhang
2024-11-02
Abstract: Recently, 3D Gaussian Splatting (3DGS) has revolutionized radiance field reconstruction, manifesting efficient and high-fidelity novel view synthesis. However, accurately representing surfaces, especially in large and complex scenarios, remains a significant challenge due to the unstructured nature of 3DGS. In this paper, we present CityGaussianV2, a novel approach for large-scale scene reconstruction that addresses critical challenges related to geometric accuracy and efficiency. Building on the favorable generalization capabilities of 2D Gaussian Splatting (2DGS), we address its convergence and scalability issues. Specifically, we implement a decomposed-gradient-based densification and depth regression technique to eliminate blurry artifacts and accelerate convergence. To scale up, we introduce an elongation filter that mitigates the Gaussian count explosion caused by 2DGS degeneration. Furthermore, we optimize the CityGaussian pipeline for parallel training, achieving up to 10$\times$ compression, at least 25% savings in training time, and a 50% decrease in memory usage. We also establish standard geometry benchmarks for large-scale scenes. Experimental results demonstrate that our method strikes a promising balance among visual quality, geometric accuracy, and storage and training costs. The project page is available at https://dekuliutesla.github.io/CityGaussianV2/.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses efficient and geometrically accurate 3D reconstruction of large-scale scenes. It proposes CityGaussianV2, which targets the geometric accuracy and efficiency problems that existing methods face on large, complex scenes.

### Main Issues:

1. **Geometric Accuracy Issue**: Existing 3D Gaussian Splatting (3DGS) methods produce blurry and inaccurate surfaces, especially in large-scale and complex scenes.
2. **Efficiency Issue**: Existing methods suffer from high memory consumption and long training times when training and rendering large-scale scenes.
3. **Scalability Issue**: Existing methods run out of memory and degrade in performance during parallel training because high-gradient points proliferate excessively.

### Solutions:

1. **Optimization Mechanisms**:
   - **Depth Supervision**: Estimate inverse depth with Depth-Anything V2 and align it with the predicted inverse depth to improve geometric accuracy.
   - **Elongation Filter**: Evaluate the elongation rate of each surfel to prevent excessive proliferation of high-gradient points during parallel training, thus avoiding memory overflow.
   - **Decomposed-Gradient-based Densification**: Prioritize the gradient of the SSIM loss through the Decomposed-Gradient-based Densification (DGD) strategy to accelerate convergence and eliminate blurriness.
2. **Parallel Training Pipeline**:
   - **Simplified Pipeline**: Remove the time-consuming post-pruning and distillation steps and use SH degree 2 from the start, reducing memory and storage requirements.
   - **Contribution Pruning**: During block-level tuning, prune based on the single-view contribution of each Gaussian point, automatically removing redundant and unobserved points.
   - **Vector Tree Quantization**: Apply vector tree quantization to 2DGS, further compressing storage requirements.
3. **Geometric Evaluation Protocol**:
   - **Standardized Evaluation**: Adopt the evaluation protocol of the Tanks and Temples (TnT) dataset, including point cloud alignment, resampling, volume cropping, and F1 score measurement.
   - **Visible Frequency Estimation**: Estimate the cropping volume by checking the visible frequency of each point, excluding rarely observed points to keep the evaluation stable and consistent.

### Experimental Results:

Experimental results show that CityGaussianV2 balances visual quality and geometric accuracy against storage and training costs in large-scale scenes: the optimized pipeline achieves up to 10$\times$ compression, at least 25% savings in training time, and a 50% decrease in memory usage.
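
To make the depth supervision idea concrete, the following is a minimal sketch of aligning a monocular inverse-depth estimate with a rendered inverse depth before comparing them. It assumes a closed-form least-squares scale and shift followed by an L1 error, which is a common way to handle the scale ambiguity of monocular depth; the summary does not state the exact alignment used, and the function names `fit_scale_shift` and `inverse_depth_l1` are hypothetical.

```python
# Sketch only: least-squares scale/shift alignment is an assumption,
# not necessarily the alignment used by CityGaussianV2.

def fit_scale_shift(mono, rendered):
    """Return (scale, shift) minimizing sum((scale * m + shift - r) ** 2)."""
    n = len(mono)
    mean_m = sum(mono) / n
    mean_r = sum(rendered) / n
    var_m = sum((m - mean_m) ** 2 for m in mono)
    if var_m == 0.0:
        # Constant monocular estimate: only a shift can be fitted.
        return 1.0, mean_r - mean_m
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(mono, rendered))
    scale = cov / var_m
    shift = mean_r - scale * mean_m
    return scale, shift


def inverse_depth_l1(mono, rendered):
    """Mean absolute error between aligned monocular and rendered inverse depth."""
    scale, shift = fit_scale_shift(mono, rendered)
    errors = [abs(scale * m + shift - r) for m, r in zip(mono, rendered)]
    return sum(errors) / len(errors)


# Usage: flattened per-pixel inverse depths (toy values).
mono_inv_depth = [0.1, 0.2, 0.4]
rendered_inv_depth = [0.25, 0.45, 0.85]  # equals 2 * mono + 0.05
loss = inverse_depth_l1(mono_inv_depth, rendered_inv_depth)
```

In a real training loop the same computation would run on image tensors with automatic differentiation; plain lists are used here only to keep the sketch dependency-free.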
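
The elongation filter can be illustrated with a small sketch. It assumes that the elongation rate of a surfel is the ratio of its shorter to its longer in-plane scale, and that surfels below a threshold are excluded from densification; both the definition and the threshold values are assumptions made for illustration, and `Surfel`, `elongation_rate`, and `select_for_densification` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Surfel:
    scale_u: float    # in-plane scale along the first tangent axis
    scale_v: float    # in-plane scale along the second tangent axis
    grad_norm: float  # accumulated gradient magnitude used for densification


def elongation_rate(surfel):
    """Assumed definition: shorter scale divided by longer scale, in [0, 1]."""
    longer = max(surfel.scale_u, surfel.scale_v)
    shorter = min(surfel.scale_u, surfel.scale_v)
    return shorter / longer if longer > 0.0 else 0.0


def select_for_densification(surfels, grad_threshold, min_elongation):
    """Indices of surfels that have a high gradient and are not degenerate."""
    return [
        i
        for i, s in enumerate(surfels)
        if s.grad_norm >= grad_threshold and elongation_rate(s) >= min_elongation
    ]


# Usage: the second surfel is nearly a line, so it is filtered out
# even though its gradient is the largest.
surfels = [
    Surfel(1.0, 0.9, 0.5),
    Surfel(1.0, 0.001, 0.9),
    Surfel(0.5, 0.5, 0.01),
]
selected = select_for_densification(surfels, grad_threshold=0.1, min_elongation=0.01)
```

The design intent in this sketch is that a needle-like surfel covers almost no pixels, so repeatedly cloning or splitting it adds points without improving the image; gating densification on the elongation rate caps that growth.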
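
The visible frequency estimation step can be sketched as follows. The sketch assumes that per-point visibility counts are already available, that the visible frequency is the count divided by the number of views, and that the cropping volume is an axis-aligned bounding box around the retained points; the box shape, the threshold, and the name `crop_bounds_by_visibility` are assumptions for illustration.

```python
def crop_bounds_by_visibility(points, visible_counts, num_views, min_frequency):
    """Axis-aligned bounds of points seen in at least min_frequency of the views.

    Returns (lower_corner, upper_corner), or None if no point qualifies.
    """
    kept = [
        p
        for p, count in zip(points, visible_counts)
        if count / num_views >= min_frequency
    ]
    if not kept:
        return None
    lower = tuple(min(p[axis] for p in kept) for axis in range(3))
    upper = tuple(max(p[axis] for p in kept) for axis in range(3))
    return lower, upper


# Usage: the far-away point is seen in only 1 of 10 views and is excluded,
# so it does not inflate the cropping volume.
points = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (100.0, 100.0, 100.0)]
visible_counts = [8, 5, 1]
bounds = crop_bounds_by_visibility(points, visible_counts, num_views=10, min_frequency=0.3)
```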
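
The F1 score measurement in the evaluation protocol combines precision (the fraction of reconstructed points within a distance threshold of the ground truth) and recall (the fraction of ground-truth points within that threshold of the reconstruction). The sketch below computes it with a brute-force nearest-neighbor search and omits the alignment, resampling, and cropping steps; the threshold `tau` and the toy point sets are illustrative assumptions.

```python
import math


def fraction_within(source, target, tau):
    """Fraction of source points whose nearest target point is within tau."""
    if not source or not target:
        return 0.0
    hits = sum(
        1 for p in source if min(math.dist(p, q) for q in target) <= tau
    )
    return hits / len(source)


def f1_score(reconstruction, ground_truth, tau):
    """Return (precision, recall, f1) for two point sets at threshold tau."""
    precision = fraction_within(reconstruction, ground_truth, tau)
    recall = fraction_within(ground_truth, reconstruction, tau)
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2.0 * precision * recall / (precision + recall)


# Usage with toy point sets.
reconstruction = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
ground_truth = [(0.0, 0.0, 0.01), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
precision, recall, f1 = f1_score(reconstruction, ground_truth, tau=0.05)
```

The brute-force search is quadratic in the number of points; at city scale a spatial index such as a k-d tree would be needed, which is left out here to keep the example self-contained.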