DGTR: Distributed Gaussian Turbo-Reconstruction for Sparse-View Vast Scenes

Hao Li, Yuanyuan Gao, Haosong Peng, Chenming Wu, Weicai Ye, Yufeng Zhan, Chen Zhao, Dingwen Zhang, Jingdong Wang, Junwei Han
2024-11-20
Abstract: Novel-view synthesis (NVS) approaches play a critical role in vast scene reconstruction. However, these methods rely heavily on dense image inputs and prolonged training times, making them unsuitable where computational resources are limited. Additionally, few-shot methods often struggle with poor reconstruction quality in vast environments. This paper presents DGTR, a novel distributed framework for efficient Gaussian reconstruction of sparse-view vast scenes. Our approach divides the scene into regions, processed independently by drones with sparse image inputs. Using a feed-forward Gaussian model, we predict high-quality Gaussian primitives, followed by a global alignment algorithm to ensure geometric consistency. Synthetic views and depth priors are incorporated to further enhance training, while a distillation-based model aggregation mechanism enables efficient reconstruction. Our method achieves high-quality large-scale scene reconstruction and novel-view synthesis in significantly reduced training times, outperforming existing approaches in both speed and scalability. We demonstrate the effectiveness of our framework on vast aerial scenes, achieving high-quality results within minutes. Code will be released at https://3d-aigc.github.io/DGTR.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to perform efficient novel-view synthesis (NVS) from sparse viewpoints in large-scale scene reconstruction. Existing methods either rely on dense image inputs and long training times, which is impractical when computing resources are limited, or produce poor reconstructions because too few images are available for a large environment. Large-scale scenes also strain compute and storage budgets, and when images are captured by drones (unmanned aerial vehicles, UAVs), it is difficult to collect and process the dense image datasets those methods require. To overcome these challenges, the paper proposes a distributed framework named DGTR (Distributed Gaussian Turbo-Reconstruction) with three components:

1. **Distributed Gaussian Initialization**: A pre-trained feed-forward Gaussian model and a global alignment algorithm quickly generate high-quality Gaussian primitives from sparse image inputs while ensuring geometric consistency.
2. **Multi-device Parallel Training**: Each device (such as a drone) independently processes the sparse images of its own non-overlapping region and trains a local Gaussian model, which greatly reduces initialization and training time.
3. **Distillation-based Model Aggregation**: The trained local models are uploaded to a central server, where a distillation mechanism merges them into the final high-quality large-scale scene model.

With these components, DGTR achieves high-quality reconstruction and novel-view synthesis of large-scale scenes within a few minutes, improving both speed and quality over existing methods in the sparse-view, large-scale setting.
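The data flow of the three stages can be outlined with a small Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the uniform grid partition, the function names (`assign_region`, `partition_views`, `train_local`, `aggregate`), and the dictionary-based "primitive" are all hypothetical. In particular, `train_local` is a placeholder for the feed-forward initialization and local training, and `aggregate` uses plain concatenation in place of the paper's distillation-based merge.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]   # ground-plane (x, y) position of a camera
Region = Tuple[int, int]      # integer index of a grid cell


def assign_region(xy: Point, origin: Point, cell: float) -> Region:
    """Map a ground-plane position to the index of the grid cell containing it."""
    return (int((xy[0] - origin[0]) // cell), int((xy[1] - origin[1]) // cell))


def partition_views(cameras: List[Point], origin: Point, cell: float) -> Dict[Region, List[int]]:
    """Group camera indices by non-overlapping grid cell (assumed: one cell per device)."""
    regions: Dict[Region, List[int]] = defaultdict(list)
    for idx, xy in enumerate(cameras):
        regions[assign_region(xy, origin, cell)].append(idx)
    return dict(regions)


def train_local(view_ids: List[int], cameras: List[Point]) -> List[dict]:
    """Placeholder for per-device initialization and training.

    Returns one dummy 'primitive' per input view; a real system would
    predict and optimize Gaussian primitives here.
    """
    return [{"center": cameras[i], "source_view": i} for i in view_ids]


def aggregate(local_models: Dict[Region, List[dict]]) -> List[dict]:
    """Stand-in for the server-side merge: plain concatenation, NOT distillation."""
    merged: List[dict] = []
    for region in sorted(local_models):
        merged.extend(local_models[region])
    return merged


# Usage: four sparse views over a 2 x 2 grid of unit cells.
cameras = [(0.5, 0.5), (1.5, 0.2), (0.1, 0.9), (1.9, 1.9)]
regions = partition_views(cameras, origin=(0.0, 0.0), cell=1.0)
local_models = {r: train_local(ids, cameras) for r, ids in regions.items()}
scene = aggregate(local_models)
```

Because each region's work depends only on its own views, the `train_local` calls could run on separate devices in parallel, and only the resulting local models need to reach the server.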