Efficient 3D View Synthesis from Single-Image Utilizing Diffusion Priors

Yifan Wen, Zitong Wang, Zhuoyuan Li, Dongxing Wei, Yi Sun
DOI: https://doi.org/10.1007/978-981-97-4399-5_9
2024-01-01
Abstract: In this paper, we introduce a framework for synthesizing novel views of an object from a single image. Our method uses the latent 3D knowledge of a fine-tuned diffusion model as a prior for reconstructing 3D scenes, which makes it possible to generate high-fidelity 3D content from a single 2D viewpoint. We employ a two-stage process: we first fine-tune a diffusion model on the given image viewpoint, and then optimize a neural radiance field using score distillation sampling (SDS). Our technique not only preserves fidelity to the original image but also enhances the perceptual understanding of the object in three dimensions. The method is effective for a wide range of objects without requiring training on multiple views, and is applicable to both real-world and synthetic datasets. The resulting 3D reconstructions exhibit detailed geometry and realistic textures that closely match the input images.
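The second stage named in the abstract, optimization with score distillation sampling, can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract gives no equations, so the code follows the commonly used SDS form, in which the gradient is w(t) * (predicted_noise - injected_noise). Everything here is an assumption made for illustration: the function names (`sds_gradient`, `point_mass_noise_predictor`, `optimize`), the weighting w = 1 - alpha_bar, and above all the "diffusion model", which is replaced by the closed-form optimal noise predictor for a data distribution concentrated on one target image. The image pixels are optimized directly, standing in for a radiance field whose rendering Jacobian would normally carry the gradient back to its parameters.

```python
import numpy as np


def point_mass_noise_predictor(target):
    """Hypothetical stand-in for a fine-tuned diffusion model.

    If the data distribution is a single image `target`, the optimal
    noise prediction for x_t = sqrt(a) * x0 + sqrt(1 - a) * eps is
    (x_t - sqrt(a) * target) / sqrt(1 - a).
    """
    def predict(x_t, alpha_bar):
        return (x_t - np.sqrt(alpha_bar) * target) / np.sqrt(1.0 - alpha_bar)
    return predict


def sds_gradient(x, predict_noise, alpha_bar, rng):
    """One-sample SDS gradient with respect to the rendered image x."""
    eps = rng.standard_normal(x.shape)            # injected noise
    x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps
    eps_hat = predict_noise(x_t, alpha_bar)       # model's noise estimate
    weight = 1.0 - alpha_bar                      # assumed weighting w(t)
    return weight * (eps_hat - eps)


def optimize(x_init, predict_noise, steps=200, lr=1.0, seed=0):
    """Plain gradient descent on the image using SDS gradients."""
    rng = np.random.default_rng(seed)
    x = x_init.copy()
    for _ in range(steps):
        # Sample a noise level; alpha_bar near 1 means little noise.
        alpha_bar = rng.uniform(0.02, 0.98)
        x -= lr * sds_gradient(x, predict_noise, alpha_bar, rng)
    return x


if __name__ == "__main__":
    target = np.linspace(0.0, 1.0, 16).reshape(4, 4)
    result = optimize(np.zeros((4, 4)), point_mass_noise_predictor(target))
    print(np.abs(result - target).max())
```

With this toy prior the injected noise cancels exactly, so each step pulls the image toward the target. In a real pipeline the predictor would be a conditioned diffusion network and the gradient would be backpropagated through a differentiable renderer.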