CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians

Avinash Paliwal, Wei Ye, Jinhui Xiong, Dmytro Kotovenko, Rakesh Ranjan, Vikas Chandra, Nima Khademi Kalantari
2024-03-28
Abstract: The field of 3D reconstruction from images has rapidly evolved in the past few years, first with the introduction of Neural Radiance Field (NeRF) and more recently with 3D Gaussian Splatting (3DGS). The latter provides a significant edge over NeRF in terms of training and inference speed, as well as reconstruction quality. Although 3DGS works well for dense input images, the unstructured point-cloud-like representation quickly overfits to the more challenging setup of extremely sparse input images (e.g., 3 images), creating a representation that appears as a jumble of needles from novel views. To address this issue, we propose regularized optimization and depth-based initialization. Our key idea is to introduce a structured Gaussian representation that can be controlled in 2D image space. We then constrain the Gaussians, in particular their position, and prevent them from moving independently during optimization. Specifically, we introduce single-view and multiview constraints through an implicit convolutional decoder and a total variation loss, respectively. With the coherency introduced to the Gaussians, we further constrain the optimization through a flow-based loss function. To support our regularized optimization, we propose an approach to initialize the Gaussians using monocular depth estimates at each input view. We demonstrate significant improvements compared to the state-of-the-art sparse-view NeRF-based approaches on a variety of scenes.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
The paper addresses high-quality novel view synthesis with 3D Gaussian Splatting (3DGS) from extremely sparse input images (e.g., 3 images). Existing 3DGS methods tend to overfit in this setting, producing representations that are incoherent when rendered from new viewpoints. As shown in Figure 1, the rendered novel views appear as a jumble of needle-like artifacts. To solve this problem, the authors introduce a structured Gaussian representation with coherence constraints, and improve sparse-view reconstruction quality through regularized optimization and depth-based initialization. The main contributions of the paper are:
1. A method for 3D reconstruction with 3DGS from extremely sparse input images.
2. A structured Gaussian representation whose coherence is enforced through several regularization terms.
3. A depth-based initialization of the 3D Gaussians that complements the regularized optimization.
Through these contributions, the authors aim to overcome the limitations of existing 3DGS methods on sparse inputs and improve the quality of novel view synthesis.
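Two of the ingredients named above, depth-based initialization and a total variation penalty, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `backproject_depth` and `total_variation`, the pinhole intrinsics `K`, the 4x4 `cam_to_world` pose, and the half-pixel center convention are all assumptions made for illustration. The sketch also assumes the depth map is already in scene units, whereas monocular depth estimates are typically only defined up to scale and shift and would need alignment first.

```python
import numpy as np


def backproject_depth(depth, K, cam_to_world):
    """Lift an (H, W) depth map to (H*W, 3) world-space points.

    Hypothetical helper: one candidate Gaussian center per pixel.
    """
    h, w = depth.shape
    # Pixel centers: u runs along columns (x), v along rows (y).
    u, v = np.meshgrid(np.arange(w) + 0.5, np.arange(h) + 0.5)
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Camera-space rays with z = 1, then scaled by the per-pixel depth.
    rays = pixels @ np.linalg.inv(K).T
    points_cam = rays * depth.reshape(-1, 1)
    # Rotate and translate into world space.
    return points_cam @ cam_to_world[:3, :3].T + cam_to_world[:3, 3]


def total_variation(field):
    """Anisotropic TV: mean absolute difference between neighboring pixels."""
    dx = np.abs(field[:, 1:] - field[:, :-1]).mean()
    dy = np.abs(field[1:, :] - field[:-1, :]).mean()
    return dx + dy


if __name__ == "__main__":
    K = np.array([[2.0, 0.0, 1.0],
                  [0.0, 2.0, 1.0],
                  [0.0, 0.0, 1.0]])
    depth = np.full((2, 2), 3.0)
    points = backproject_depth(depth, K, np.eye(4))
    print(points.shape)            # one 3D point per pixel
    print(total_variation(depth))  # a constant map has zero variation
```

Because each point is tied to a pixel, the Gaussians inherit the 2D grid structure of the image, which is what makes an image-space smoothness term such as total variation applicable to them.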