SparseGS: Real-Time 360° Sparse View Synthesis using Gaussian Splatting

Haolin Xiong, Sairisheek Muttukuru, Rishi Upadhyay, Pradyumna Chari, Achuta Kadambi
2024-05-13
Abstract: The problem of novel view synthesis has grown significantly in popularity recently with the introduction of Neural Radiance Fields (NeRFs) and other implicit scene representation methods. A recent advance, 3D Gaussian Splatting (3DGS), leverages an explicit representation to achieve real-time rendering with high-quality results. However, 3DGS still requires an abundance of training views to generate a coherent scene representation. In few-shot settings, similar to NeRF, 3DGS tends to overfit to training views, causing background collapse and excessive floaters, especially as the number of training views is reduced. We propose a method to enable training coherent 3DGS-based radiance fields of 360-degree scenes from sparse training views. We integrate depth priors with generative and explicit constraints to reduce background collapse, remove floaters, and enhance consistency from unseen viewpoints. Experiments show that our method outperforms base 3DGS by 6.4% in LPIPS and by 12.2% in PSNR, and NeRF-based methods by at least 17.6% in LPIPS on the MipNeRF-360 dataset with substantially less training and inference cost.
Computer Vision and Pattern Recognition, Machine Learning, Image and Video Processing
What problem does this paper attempt to address?
The paper addresses high-quality 360° novel view synthesis from sparse input views. Existing methods such as NeRF (Neural Radiance Fields) and 3D Gaussian Splatting (3DGS) perform well at novel view synthesis but typically require a large number of training views to produce a coherent scene representation. As the number of views is reduced, these methods overfit to the training views, which causes background collapse and floaters. The paper therefore proposes a method for training a coherent 3DGS-based radiance field of a 360° scene from sparse views, integrating depth priors with generative and explicit constraints. The main contributions are:
1. A technique for training 3D radiance fields from sparse views for novel view synthesis of 360° unbounded scenes. Experiments on the MipNeRF-360 dataset show that it outperforms baseline 3D Gaussian Splatting by 6.4% in LPIPS and by 12.2% in PSNR, and NeRF-based methods by at least 17.6% in LPIPS.
2. A technique for estimating depth from 3D Gaussian representations, which enables better depth optimization.
3. An explicit adaptive operator for pruning "floaters" in the 3D representation.
With these techniques, the authors show that high-quality novel view synthesis is achievable even from sparse views, with substantially less training and inference cost.
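To make the depth and floater contributions more concrete, the sketch below shows one common way to read a per-ray depth out of a set of Gaussians: front-to-back alpha compositing of per-Gaussian depths. This is a generic baseline and an assumption for illustration only; the summary above does not describe the paper's own depth estimator, and the function name `alpha_blended_depth` and its inputs are hypothetical.

```python
def alpha_blended_depth(depths, alphas):
    """Expected depth along one ray via front-to-back alpha compositing.

    NOTE: a generic baseline, not the estimator proposed in the paper.
    depths: per-Gaussian depths along the ray (any order).
    alphas: per-Gaussian opacities in [0, 1] as seen by this ray.
    Returns (depth, accumulated_opacity).
    """
    # Visit Gaussians from nearest to farthest.
    order = sorted(range(len(depths)), key=lambda i: depths[i])
    transmittance = 1.0  # fraction of light not yet absorbed
    depth = 0.0
    for i in order:
        weight = transmittance * alphas[i]  # contribution of Gaussian i
        depth += weight * depths[i]
        transmittance *= 1.0 - alphas[i]
    return depth, 1.0 - transmittance


if __name__ == "__main__":
    # A faint Gaussian close to the camera (a "floater") in front of a
    # nearly opaque surface farther away.
    print(alpha_blended_depth([0.5, 4.0], [0.2, 0.9]))
    # The same ray with the floater removed.
    print(alpha_blended_depth([4.0], [0.9]))
```

In the first call the low-opacity Gaussian near the camera still receives a nonzero blending weight, so it shifts the composited depth away from what the surface alone would give. That is one intuition for why floaters interfere with depth supervision and why pruning them can help.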