SWAG: Splatting in the Wild images with Appearance-conditioned Gaussians

Hiba Dahmani, Moussab Bennehar, Nathan Piasco, Luis Roldao, Dzmitry Tsishkou
2024-04-06
Abstract: Implicit neural representation methods have shown impressive advancements in learning 3D scenes from unstructured in-the-wild photo collections but are still limited by the large computational cost of volumetric rendering. More recently, 3D Gaussian Splatting emerged as a much faster alternative with superior rendering quality and training efficiency, especially for small-scale and object-centric scenarios. Nevertheless, this technique suffers from poor performance on unstructured in-the-wild data. To tackle this, we extend 3D Gaussian Splatting to handle unstructured image collections. We achieve this by modeling appearance to capture photometric variations in the rendered images. Additionally, we introduce a new mechanism to train transient Gaussians to handle the presence of scene occluders in an unsupervised manner. Experiments on diverse photo collection scenes and multi-pass acquisition of outdoor landmarks show the effectiveness of our method over prior works, achieving state-of-the-art results with improved efficiency.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the poor performance of 3D Gaussian Splatting (3DGS) on unconstrained, in-the-wild photo collections, where lighting variations, dynamic objects, and occluders differ from image to image. Existing Neural Radiance Field (NeRF) methods can learn scenes from such collections and render realistic novel views, but they are limited by the large computational cost of volumetric rendering. To address this, the paper proposes SWAG (Splatting in the Wild images with Appearance-conditioned Gaussians), which it presents as the first in-the-wild extension of 3DGS. SWAG improves on 3DGS in the following ways:
1. It introduces a learned embedding space that captures the appearance of each image, so that rendering adapts to photometric variations between images.
2. It learns image-dependent opacity variations to better handle dynamic objects and improve the precision of the scene reconstruction.
3. It proposes an unsupervised mechanism for training transient Gaussians to handle occluders in the scene.
Experimental results show that SWAG improves on 3DGS across various scenes and surpasses previous works in rendering quality, while being more efficient in training and rendering. Additionally, SWAG can generate new images with smooth visual transitions and can remove dynamic objects from the captured scene in an unsupervised manner.
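The first two ideas above, a learned per-image appearance embedding and an image-dependent opacity variation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the network shape, the inputs to the MLP, the sigmoid gating, and every name and size here (`image_embeddings`, `condition_gaussians`, `EMBED_DIM`, and so on) are assumptions chosen for illustration. The weights are random rather than trained, and no splatting or rasterization is performed.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are hypothetical, chosen only to keep the sketch small.
NUM_IMAGES = 4      # training photos in the collection
NUM_GAUSSIANS = 5   # Gaussians in the scene
EMBED_DIM = 8       # length of each per-image appearance embedding
HIDDEN = 16         # hidden width of the small MLP

# One appearance embedding per training image (learned in practice, random here).
image_embeddings = rng.normal(scale=1.0, size=(NUM_IMAGES, EMBED_DIM))

# Image-independent attributes stored on each Gaussian.
base_color = rng.uniform(size=(NUM_GAUSSIANS, 3))
base_opacity = rng.uniform(0.2, 1.0, size=NUM_GAUSSIANS)

# Weights of a two-layer MLP (assumed architecture, untrained).
W1 = rng.normal(scale=0.5, size=(3 + EMBED_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.5, size=(HIDDEN, 4))
b2 = np.zeros(4)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def condition_gaussians(image_id):
    """Return per-Gaussian (color, opacity) conditioned on one image."""
    # Repeat the image's embedding so every Gaussian sees the same vector.
    emb = np.broadcast_to(image_embeddings[image_id], (NUM_GAUSSIANS, EMBED_DIM))
    x = np.concatenate([base_color, emb], axis=1)
    h = np.maximum(x @ W1 + b1, 0.0)          # ReLU hidden layer
    out = h @ W2 + b2
    color = sigmoid(out[:, :3])               # image-dependent RGB in (0, 1)
    keep = sigmoid(out[:, 3])                 # soft per-image visibility in (0, 1)
    opacity = base_opacity * keep             # can only lower the base opacity
    return color, opacity


color_0, opacity_0 = condition_gaussians(0)
color_1, opacity_1 = condition_gaussians(1)
```

In this sketch the same Gaussians yield different colors and opacities depending on which image's embedding is supplied, which is the mechanism the summary describes for absorbing photometric variation. Driving the visibility term toward zero for a given image suppresses a Gaussian in that image only, which is one way to picture how content present in a single photo could be separated from the static scene.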