Abstract: Novel-view synthesis (NVS) approaches play a critical role in vast scene reconstruction. However, these methods rely heavily on dense image inputs and prolonged training times, making them unsuitable where computational resources are limited. Additionally, few-shot methods often struggle with poor reconstruction quality in vast environments. This paper presents DGTR, a novel distributed framework for efficient Gaussian reconstruction of sparse-view vast scenes. Our approach divides the scene into regions, processed independently by drones with sparse image inputs. Using a feed-forward Gaussian model, we predict high-quality Gaussian primitives, followed by a global alignment algorithm to ensure geometric consistency. Synthetic views and depth priors are incorporated to further enhance training, while a distillation-based model aggregation mechanism enables efficient reconstruction. Our method achieves high-quality large-scale scene reconstruction and novel-view synthesis in significantly reduced training times, outperforming existing approaches in both speed and scalability. We demonstrate the effectiveness of our framework on vast aerial scenes, achieving high-quality results within minutes. Code will be released at https://3d-aigc.github.io/DGTR.

What problem does this paper attempt to address?

The paper addresses how to perform novel-view synthesis (NVS) efficiently from sparse viewpoints in large-scale scene reconstruction. Existing methods either rely on dense image inputs and long training times, which is impractical when computing resources are limited, or produce poor reconstructions because too little image data is available for a large environment. Large-scale scenes also strain compute and storage capacity, and when images are captured by drones (unmanned aerial vehicles, UAVs) it is difficult to collect and process the dense image datasets these methods require.

To this end, the paper proposes a distributed framework named DGTR (Distributed Gaussian Turbo-Reconstruction), which addresses these challenges in three ways:

1. **Distributed Gaussian Initialization**: A pre-trained feed-forward Gaussian model and a global alignment algorithm quickly generate high-quality Gaussian primitives from sparse image inputs and ensure geometric consistency.
2. **Multi-device Parallel Training**: Each device (such as a drone) independently processes the sparse images of a non-overlapping region and trains a local Gaussian model, greatly reducing initialization and training time.
3. **Distillation-based Model Aggregation**: The trained local models are uploaded to a central server, where a distillation mechanism merges them into the final high-quality large-scale scene model.

With these components, DGTR achieves high-quality reconstruction and novel-view synthesis of large-scale scenes within a few minutes, improving both speed and quality over existing methods for large-scale scenes under sparse viewpoints.
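The overall data flow described above (split the scene into non-overlapping regions, let each device work on its own region, then collect the results on a central server) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the DGTR code: the function names `assign_regions`, `train_local_model`, `run_devices`, and `aggregate` are hypothetical, regions are assumed to be cells of a regular 2D grid over camera positions, the per-device training is a placeholder, and the aggregation step simply gathers local results rather than performing the distillation the paper describes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def assign_regions(camera_xy, bounds, grid):
    """Group camera indices by the non-overlapping grid cell they fall in.

    Assumption: regions are cells of a regular rows x cols grid over the
    ground plane; the paper does not specify the partitioning in this text.
    """
    (xmin, ymin), (xmax, ymax) = bounds
    rows, cols = grid
    regions = defaultdict(list)
    for idx, (x, y) in enumerate(camera_xy):
        col = int((x - xmin) / (xmax - xmin) * cols)
        row = int((y - ymin) / (ymax - ymin) * rows)
        # Clamp so cameras on or beyond the boundary land in an edge cell.
        col = max(0, min(col, cols - 1))
        row = max(0, min(row, rows - 1))
        regions[(row, col)].append(idx)
    return dict(regions)


def train_local_model(region, image_ids):
    """Placeholder for the per-device work (hypothetical).

    A real device would initialize Gaussians with a feed-forward model and
    train a local Gaussian model; here we only record which views it saw.
    """
    return {"region": region, "image_ids": list(image_ids)}


def run_devices(regions, max_workers=4):
    """Process every region independently, one task per simulated device."""
    items = sorted(regions.items())
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `items`.
        return list(pool.map(lambda item: train_local_model(*item), items))


def aggregate(local_models):
    """Stand-in for the central server: collect local results by region.

    This is NOT distillation; it only shows where aggregation would happen.
    """
    return {model["region"]: model["image_ids"] for model in local_models}
```

A typical call chain would be `aggregate(run_devices(assign_regions(cameras, bounds, (2, 2))))`, where `cameras` is a list of `(x, y)` positions. Because the grid cells do not overlap, each image is handled by exactly one device, which is what lets the per-region work run in parallel.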

DGTR: Distributed Gaussian Turbo-Reconstruction for Sparse-View Vast Scenes

VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction

VDG: Vision-Only Dynamic Gaussian for Driving Simulation

DrivingRecon: Large 4D Gaussian Reconstruction Model For Autonomous Driving

Scalable Indoor Novel-View Synthesis using Drone-Captured 360 Imagery with 3D Gaussian Splatting

Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers

Horizon-GS: Unified 3D Gaussian Splatting for Large-Scale Aerial-to-Ground Scenes

LM-Gaussian: Boost Sparse-view 3D Gaussian Splatting with Large Model Priors

Optimizing 3D Gaussian Splatting for Sparse Viewpoint Scene Reconstruction

TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers

RTG-SLAM: Real-time 3D Reconstruction at Scale using Gaussian Splatting

GGRt: Towards Pose-free Generalizable 3D Gaussian Splatting in Real-time

CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes

SparseGS: Real-Time 360° Sparse View Synthesis using Gaussian Splatting

GaRField++: Reinforced Gaussian Radiance Fields for Large-Scale 3D Scene Reconstruction

MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo

Wild-GS: Real-Time Novel View Synthesis from Unconstrained Photo Collections

Periodic Vibration Gaussian: Dynamic Urban Scene Reconstruction and Real-time Rendering

A Hierarchical 3D Gaussian Representation for Real-Time Rendering of Very Large Datasets

HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting

DGNS: Deformable Gaussian Splatting and Dynamic Neural Surface for Monocular Dynamic 3D Reconstruction