VividDreamer: Invariant Score Distillation For Hyper-Realistic Text-to-3D Generation

Wenjie Zhuo, Fan Ma, Hehe Fan, Yi Yang
2024-07-17
Abstract: This paper presents Invariant Score Distillation (ISD), a novel method for high-fidelity text-to-3D generation. ISD aims to tackle the over-saturation and over-smoothing problems in Score Distillation Sampling (SDS). In this paper, SDS is decoupled into a weighted sum of two components: the reconstruction term and the classifier-free guidance term. We experimentally found that over-saturation stems from the large classifier-free guidance scale and over-smoothing comes from the reconstruction term. To overcome these problems, ISD utilizes an invariant score term derived from DDIM sampling to replace the reconstruction term in SDS. This operation allows the utilization of a medium classifier-free guidance scale and mitigates the reconstruction-related errors, thus preventing the over-smoothing and over-saturation of results. Extensive experiments demonstrate that our method greatly enhances SDS and produces realistic 3D objects through single-stage optimization.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the over-saturation and over-smoothing problems that Score Distillation Sampling (SDS) encounters in realistic text-to-3D generation. SDS generates 3D objects from text by using a pre-trained text-to-image diffusion model to guide the optimization of a 3D representation. In practice, it suffers from two main problems:

1. **Over-saturation**: A large classifier-free guidance scale gives the results overly saturated colors and an unnatural look.
2. **Over-smoothing**: The reconstruction term makes the results blurry and overly smooth in some cases, losing fine detail.

To solve these problems, the authors propose **Invariant Score Distillation (ISD)**. ISD replaces the reconstruction term in SDS with an invariant score term derived from DDIM sampling. This term avoids single-step reconstruction errors and continuously refines the score prior during generation. It also permits a medium classifier-free guidance scale, so both over-saturation and over-smoothing are mitigated.

### Main contributions

1. **Problem analysis**: The authors decompose SDS into a classifier-free guidance term and a reconstruction term, and find experimentally that over-saturation stems from the large classifier-free guidance scale, while over-smoothing stems from the reconstruction term.
2. **Proposing ISD**: ISD replaces the reconstruction term with an invariant score term, targeting both problems at their source.
3. **Experimental verification**: Quantitative and qualitative experiments show that ISD generates high-fidelity 3D objects in a single optimization stage and is reported to outperform existing methods.

Together, these changes improve the quality of text-to-3D generation over SDS and address its over-saturation and over-smoothing problems.
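The decomposition of SDS into a reconstruction term and a classifier-free guidance term can be illustrated numerically. The sketch below is not the authors' implementation: it assumes one common guidance convention, `eps_cfg = eps_uncond + s * (eps_cond - eps_uncond)`, and the paper's exact weighting may differ. The function names `cfg_noise` and `sds_terms` are hypothetical, and random arrays stand in for the noise predictions of a real diffusion model.

```python
import numpy as np


def cfg_noise(eps_cond, eps_uncond, scale):
    """Guided noise prediction under the assumed convention:
    eps_uncond + scale * (eps_cond - eps_uncond)."""
    return eps_uncond + scale * (eps_cond - eps_uncond)


def sds_terms(eps_cond, eps_uncond, eps, scale):
    """Split the SDS residual (eps_cfg - eps) into two parts.

    reconstruction: conditional prediction minus the injected noise.
    guidance:       (scale - 1) times the conditional/unconditional gap.
    Their sum equals cfg_noise(...) - eps by simple algebra.
    """
    reconstruction = eps_cond - eps
    guidance = (scale - 1.0) * (eps_cond - eps_uncond)
    return reconstruction, guidance


# Random stand-ins for model outputs (assumption: no real model is called).
rng = np.random.default_rng(0)
eps_cond, eps_uncond, eps = (rng.standard_normal((4, 4)) for _ in range(3))

for scale in (7.5, 100.0):
    rec, guid = sds_terms(eps_cond, eps_uncond, eps, scale)
    residual = cfg_noise(eps_cond, eps_uncond, scale) - eps
    # The two terms add up to the full SDS residual.
    assert np.allclose(residual, rec + guid)
    # Ratio of norms: how strongly guidance dominates at this scale.
    print(scale, np.linalg.norm(guid) / np.linalg.norm(rec))
```

Under this convention the reconstruction term does not depend on the guidance scale, while the guidance term grows linearly with it. That is consistent with the summary's point that the two failure modes have separate sources, and that replacing the reconstruction term is what lets ISD work with a medium guidance scale.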