Abstract: Gaussian splatting has gained attention for its efficient representation and rendering of 3D scenes using continuous Gaussian primitives. However, it struggles with sparse-view inputs due to limited geometric and photometric information, causing ambiguities in depth, shape, and texture. We propose GBR: Generative Bundle Refinement, a method for high-fidelity Gaussian splatting and meshing using only 4-6 input views. GBR integrates a neural bundle adjustment module to enhance geometry accuracy and a generative depth refinement module to improve geometry fidelity. More specifically, the neural bundle adjustment module integrates a foundation network to produce initial 3D point maps and point matches from unposed images, followed by bundle adjustment optimization to improve multiview consistency and point cloud accuracy. The generative depth refinement module employs a diffusion-based strategy to enhance geometric details and fidelity while preserving the scale. Finally, for Gaussian splatting optimization, we propose a multimodal loss function incorporating depth and normal consistency, geometric regularization, and pseudo-view supervision, providing robust guidance under sparse-view conditions. Experiments on widely used datasets show that GBR significantly outperforms existing methods under sparse-view inputs. Additionally, GBR demonstrates the ability to reconstruct and render large-scale real-world scenes, such as the Pavilion of Prince Teng and the Great Wall, with remarkable details using only 6 views.

What problem does this paper attempt to address?

The paper addresses the problem of achieving high-fidelity Gaussian splatting and meshing from only a small number of input views (4-6). Existing methods face the following challenges with sparse-view inputs:

1. **Geometric Accuracy**: Traditional methods usually rely on Structure-from-Motion (SfM) to generate initial point clouds and camera poses for high-precision 3D reconstruction. With sparse-view inputs, however, SfM has difficulty producing a sufficiently dense and complete point cloud, which limits geometric accuracy.
2. **Mesh Fidelity**: A high-precision point cloud alone is not sufficient to reconstruct a high-fidelity mesh. Because sparse views provide limited multi-view information, geometric details are easily lost during Gaussian primitive optimization, which degrades the quality of the reconstructed mesh.
3. **Insufficient Supervision**: Sparse-view inputs provide limited supervision signals for Gaussian primitive optimization, so the optimization is prone to getting trapped in local minima. Effective loss functions and regularization terms are therefore needed to guide it.

To address these challenges, the authors propose GBR (Generative Bundle Refinement), a framework for high-fidelity Gaussian splatting and meshing. GBR addresses the above problems through the following key components:

1. **Neural Bundle Adjustment Module**: Combines a traditional bundle adjustment optimizer with deep-learning-based geometric estimators (such as the DUSt3R network) to improve geometric accuracy and point cloud density.
2. **Generative Depth Refinement Module**: Uses diffusion models to integrate high-resolution RGB information into the point cloud, enhancing geometric detail and smoothness while keeping the depth scale consistent.
3. **Multimodal Loss Function**: Combines depth, normal, geometric consistency, pseudo-view synthesis, and photometric losses to provide stronger supervision signals, making Gaussian primitive optimization more accurate and robust.

With these components, GBR achieves high-quality camera parameter estimation, depth and normal map estimation, novel view synthesis, and mesh reconstruction with only 4-6 input views. Experimental results show that GBR performs significantly better than existing methods under sparse-view inputs and can reconstruct and render large-scale real-world scenes, such as the Pavilion of Prince Teng and the Great Wall, with a high level of detail.
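To make the idea of a multimodal loss concrete, here is a minimal sketch of a weighted sum of photometric, depth, and normal terms. It is an illustration only, not the authors' implementation: the function names, the choice of L1 and cosine terms, and the weight values are all assumptions, and the geometric-consistency and pseudo-view terms mentioned above are omitted for brevity.

```python
import numpy as np

def l1(a, b):
    """Mean absolute difference between two arrays of equal shape."""
    return float(np.mean(np.abs(a - b)))

def normal_loss(n_pred, n_ref):
    """One minus the mean cosine similarity of unit normal maps, shape (H, W, 3)."""
    cos = np.sum(n_pred * n_ref, axis=-1)
    return float(np.mean(1.0 - cos))

def multimodal_loss(rendered, reference, weights):
    """Weighted sum of per-modality terms (hypothetical formulation).

    `rendered` and `reference` are dicts with "rgb", "depth", and "normal" arrays;
    `weights` maps each term name to an assumed scalar weight.
    """
    terms = {
        "photometric": l1(rendered["rgb"], reference["rgb"]),
        "depth": l1(rendered["depth"], reference["depth"]),
        "normal": normal_loss(rendered["normal"], reference["normal"]),
    }
    total = sum(weights[name] * value for name, value in terms.items())
    return total, terms

# Toy example: only the depth map differs, by a constant offset of 0.5.
h, w = 4, 4
normals = np.zeros((h, w, 3))
normals[..., 2] = 1.0  # every normal points along +z
ref = {"rgb": np.full((h, w, 3), 0.5), "depth": np.ones((h, w)), "normal": normals}
ren = {"rgb": ref["rgb"].copy(), "depth": ref["depth"] + 0.5, "normal": normals.copy()}
weights = {"photometric": 1.0, "depth": 0.1, "normal": 0.05}  # assumed values
total, terms = multimodal_loss(ren, ref, weights)
```

In this toy case only the depth term is non-zero, so the total equals the depth weight times the depth error. In a real pipeline the terms would be computed on differentiable renderings so that gradients reach the Gaussian parameters.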

GBR: Generative Bundle Refinement for High-fidelity Gaussian Splatting and Meshing

Unbounded-GS: Extending 3D Gaussian Splatting with Hybrid Representation for Unbounded Large-Scale Scene Reconstruction

GeoGaussian: Geometry-aware Gaussian Splatting for Scene Rendering

GaussianPro: 3D Gaussian Splatting with Progressive Propagation

Optimizing 3D Gaussian Splatting for Sparse Viewpoint Scene Reconstruction

LM-Gaussian: Boost Sparse-view 3D Gaussian Splatting with Large Model Priors

GS^3: Efficient Relighting with Triple Gaussian Splatting

PGSR: Planar-based Gaussian Splatting for Efficient and High-Fidelity Surface Reconstruction

6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering

RaDe-GS: Rasterizing Depth in Gaussian Splatting

MeshGS: Adaptive Mesh-Aligned Gaussian Splatting for High-Quality Rendering

GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views

MVG-Splatting: Multi-View Guided Gaussian Splatting with Adaptive Quantile-Based Geometric Consistency Densification

GaussianRoom: Improving 3D Gaussian Splatting with SDF Guidance and Monocular Cues for Indoor Scene Reconstruction

ResGS: Residual Densification of 3D Gaussian for Efficient Detail Recovery

SRGS: Super-Resolution 3D Gaussian Splatting

SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction

AAGS: Appearance-Aware 3D Gaussian Splatting with Unconstrained Photo Collections

GGRt: Towards Pose-free Generalizable 3D Gaussian Splatting in Real-time