GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction

Yuxuan Mu, Xinxin Zuo, Chuan Guo, Yilin Wang, Juwei Lu, Xiaofeng Wu, Songcen Xu, Peng Dai, Youliang Yan, Li Cheng
2024-10-30
Abstract: We present GSD, a diffusion model approach based on Gaussian Splatting (GS) representation for 3D object reconstruction from a single view. Prior works suffer from inconsistent 3D geometry or mediocre rendering quality due to improper representations. We take a step towards resolving these shortcomings by utilizing the recent state-of-the-art 3D explicit representation, Gaussian Splatting, and an unconditional diffusion model. This model learns to generate 3D objects represented by sets of GS ellipsoids. With these strong generative 3D priors, though learning unconditionally, the diffusion model is ready for view-guided reconstruction without further model fine-tuning. This is achieved by propagating fine-grained 2D features through the efficient yet flexible splatting function and the guided denoising sampling process. In addition, a 2D diffusion model is further employed to enhance rendering fidelity, and improve reconstructed GS quality by polishing and re-using the rendered images. The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views. Experiments on the challenging real-world CO3D dataset demonstrate the superiority of our approach. Project page: https://yxmu.foo/GSD/
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
This paper addresses high-quality 3D object reconstruction from a single-view image. Existing methods tend to produce inconsistent 3D geometry or mediocre renderings, which the authors attribute to unsuitable 3D representations. The paper proposes GSD, a diffusion model built on the Gaussian Splatting (GS) representation, which combines this recent explicit 3D representation with an unconditional diffusion model. The model learns to generate 3D objects represented as sets of GS ellipsoids. Although it is trained unconditionally, it supports view-guided reconstruction without further fine-tuning: fine-grained 2D features are propagated through the splatting function and steer the denoising sampling process. A 2D diffusion model is additionally used to enhance rendering fidelity and to improve the reconstructed GS by polishing and reusing the rendered images. The reconstructed objects come with high-quality 3D structure and texture and can be rendered efficiently from arbitrary views. Experiments on the challenging real-world CO3D dataset demonstrate the superiority of the approach.
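The idea of steering an unconditional diffusion sampler with a single observed view can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a standard-normal prior over a small parameter vector (so the denoiser has a closed form), a hypothetical linear map `A` standing in for the differentiable splatting renderer, and a deterministic DDIM-style update. The names `alpha_bar_schedule`, `prior_denoiser`, and `guided_sample`, the noise schedule, and the guidance scale are all illustrative choices.

```python
import numpy as np

def alpha_bar_schedule(num_steps=50):
    # Cumulative signal level per step; close to 1 at t=0, close to 0 at the last step.
    betas = np.linspace(1e-4, 0.2, num_steps)
    return np.cumprod(1.0 - betas)

def prior_denoiser(x_t, alpha_bar):
    # Toy "unconditional model": the exact E[x0 | x_t] when x0 ~ N(0, I).
    return np.sqrt(alpha_bar) * x_t

def guided_sample(A, target_view, guidance_scale=0.5, num_steps=50, seed=0):
    """Deterministic DDIM-style sampling with view guidance.

    A is an assumed linear stand-in for the renderer: view = A @ params.
    """
    rng = np.random.default_rng(seed)
    alpha_bars = alpha_bar_schedule(num_steps)
    x = rng.standard_normal(A.shape[1])  # start from pure noise
    for t in range(num_steps - 1, -1, -1):
        ab = alpha_bars[t]
        ab_prev = alpha_bars[t - 1] if t > 0 else 1.0
        # 1) Unconditional estimate of the clean parameters.
        x0_hat = prior_denoiser(x, ab)
        # 2) Guidance: gradient step on 0.5 * ||A @ x0 - target_view||^2.
        residual = A @ x0_hat - target_view
        x0_hat = x0_hat - guidance_scale * (A.T @ residual)
        # 3) Re-derive the noise estimate and move to the previous noise level.
        eps_hat = (x - np.sqrt(ab) * x0_hat) / np.sqrt(1.0 - ab)
        x = np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_hat
    return x

# Usage: the "view" observes only the first two of four parameters.
A = np.eye(2, 4)
target_view = np.array([2.0, -1.0])
guided = guided_sample(A, target_view)
unguided = guided_sample(A, target_view, guidance_scale=0.0)
guided_err = np.linalg.norm(A @ guided - target_view)
unguided_err = np.linalg.norm(A @ unguided - target_view)
```

In this sketch the prior model is never retrained. The observation enters only through the gradient of a rendering loss at sampling time, which mirrors the "no further fine-tuning" property described above. With guidance switched off, the same sampler ignores the target view entirely.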