Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors

Yukang Lin, Haonan Han, Chaoqun Gong, Zunnan Xu, Yachao Zhang, Xiu Li
DOI: https://doi.org/10.1145/3664647.3680994
2024-01-01
Abstract: Reconstructing 3D objects from a single image guided by pretrained diffusion models has demonstrated promising outcomes. However, due to utilizing the case-agnostic rigid strategy, their generalization ability to arbitrary cases and the 3D consistency of reconstruction are still poor. In this work, we propose Consistent123, a case-aware two-stage method for highly consistent 3D asset reconstruction from one image with both 2D and 3D diffusion priors. In the first stage, Consistent123 utilizes only 3D structural priors for sufficient geometry exploitation, with a CLIP-based case-aware adaptive detection mechanism embedded within this process. In the second stage, 2D texture priors are introduced and progressively take on a dominant guiding role, delicately sculpting the details of the 3D model. Consistent123 aligns more closely with the evolving trends in guidance requirements, adaptively providing adequate 3D geometric initialization and suitable 2D texture refinement for different objects. Consistent123 can obtain highly 3D-consistent reconstruction and exhibits strong generalization ability across various objects. Qualitative and quantitative experiments show that our method significantly outperforms state-of-the-art image-to-3D methods. See https://Consistent123.github.io for a more comprehensive exploration of our generated 3D assets.