Splatfacto-W: A Nerfstudio Implementation of Gaussian Splatting for Unconstrained Photo Collections

Congrong Xu, Justin Kerr, Angjoo Kanazawa
2024-09-29
Abstract: Novel view synthesis from unconstrained in-the-wild image collections remains a significant yet challenging task due to photometric variations and transient occluders that complicate accurate scene reconstruction. Previous methods have approached these issues by integrating per-image appearance feature embeddings in Neural Radiance Fields (NeRFs). Although 3D Gaussian Splatting (3DGS) offers faster training and real-time rendering, adapting it for unconstrained image collections is non-trivial due to the substantially different architecture. In this paper, we introduce Splatfacto-W, an approach that integrates per-Gaussian neural color features and per-image appearance embeddings into the rasterization process, along with a spherical harmonics-based background model to represent varying photometric appearances and better depict backgrounds. Our key contributions include latent appearance modeling, efficient transient object handling, and precise background modeling. Splatfacto-W delivers high-quality, real-time novel view synthesis with improved scene consistency in in-the-wild scenarios. Our method improves the Peak Signal-to-Noise Ratio (PSNR) by an average of 5.3 dB compared to 3DGS, enhances training speed by 150 times compared to NeRF-based methods, and achieves a similar rendering speed to 3DGS. Additional video results and code integrated into Nerfstudio are available at https://kevinxu02.github.io/splatfactow/.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses novel view synthesis from unconstrained, in-the-wild image collections. Scene reconstruction from such collections is difficult because of photometric variation across images and transient occluders. Existing Neural Radiance Field (NeRF) methods can capture these variations but are slow to optimize and render. 3D Gaussian Splatting (3DGS) trains quickly and renders in real time, but it struggles with unconstrained image sets. Splatfacto-W improves on 3DGS in three respects:

1. **Implicit Appearance Modeling**: Assigns appearance features to each Gaussian so that Gaussian colors can adapt to variations across the reference images. The features are converted into explicit colors, which preserves rendering speed.
2. **Transient Object Handling**: Proposes an efficient mask-based method for handling transient objects, sharpening the focus on consistent scene features without relying on pre-trained 2D models.
3. **Background Modeling**: Uses a spherical harmonics-based background model to represent the sky and other background elements accurately, ensuring multi-view consistency.

With these improvements, Splatfacto-W raises the Peak Signal-to-Noise Ratio (PSNR) by an average of 5.3 dB compared to 3DGS, trains 150 times faster than NeRF-based methods, and renders at a speed comparable to 3DGS. The method also performs well on challenging in-the-wild datasets, enabling real-time novel view synthesis and supporting dynamic appearance changes.
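
The implicit appearance modeling described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The feature and embedding sizes, the two-layer network, the random weights, and the helper names `init_mlp` and `gaussian_colors` are all assumptions made for illustration. The sketch shows only the general idea: a per-Gaussian feature is concatenated with a per-image embedding, and a small network turns the pair into an explicit RGB color that a rasterizer could then consume.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random weights for a small fully connected network (hypothetical layout)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def gaussian_colors(features, image_embedding, mlp):
    """Map per-Gaussian features plus one per-image embedding to explicit RGB."""
    n = features.shape[0]
    # Repeat the single image embedding for every Gaussian and concatenate.
    x = np.concatenate([features, np.tile(image_embedding, (n, 1))], axis=1)
    for i, (w, b) in enumerate(mlp):
        x = x @ w + b
        if i < len(mlp) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers only
    return 1.0 / (1.0 + np.exp(-x))  # sigmoid keeps RGB in (0, 1)

rng = np.random.default_rng(0)
num_gaussians, feat_dim, embed_dim = 1000, 32, 48  # illustrative sizes (assumed)
features = rng.normal(size=(num_gaussians, feat_dim))  # one feature per Gaussian
embeddings = rng.normal(size=(5, embed_dim))           # one embedding per image
mlp = init_mlp(rng, [feat_dim + embed_dim, 64, 3])

# Same Gaussians, two different appearance embeddings, two sets of colors.
colors_img0 = gaussian_colors(features, embeddings[0], mlp)
colors_img1 = gaussian_colors(features, embeddings[1], mlp)
```

In this sketch the network runs once per Gaussian for a given embedding, and the result is a plain array of colors. That is one way to read the summary's point that converting features into explicit colors keeps rendering fast: rasterization itself needs no per-pixel network evaluation.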