DreamMapping: High-Fidelity Text-to-3D Generation via Variational Distribution Mapping

Zeyu Cai, Duotun Wang, Yixun Liang, Zhijing Shao, Ying-Cong Chen, Xiaohang Zhan, Zeyu Wang

2024-09-20

Abstract: Score Distillation Sampling (SDS) has emerged as a prevalent technique for text-to-3D generation, enabling 3D content creation by distilling view-dependent information from text-to-2D guidance. However, SDS-based methods frequently exhibit shortcomings such as over-saturated color and excessive smoothness. In this paper, we conduct a thorough analysis of SDS and refine its formulation, finding that the core design is to model the distribution of rendered images. Following this insight, we introduce a novel strategy called Variational Distribution Mapping (VDM), which expedites the distribution modeling process by regarding the rendered images as instances of degradation from diffusion-based generation. This design enables efficient training of the variational distribution by skipping the calculation of the Jacobians in the diffusion U-Net. We also introduce timestep-dependent Distribution Coefficient Annealing (DCA) to further improve distilling precision. Leveraging VDM and DCA, we use Gaussian Splatting as the 3D representation and build a text-to-3D generation framework. Extensive experiments and evaluations demonstrate the capability of VDM and DCA to generate high-fidelity and realistic assets with optimization efficiency.

Computer Vision and Pattern Recognition, Graphics

What problem does this paper attempt to address?

This paper addresses shortcomings of existing text-to-3D generation methods based on Score Distillation Sampling (SDS): the generated 3D models tend to have over-saturated colors and overly smooth surfaces, which limits the quality and detail of the resulting 3D assets.

To address this, the authors propose two components: Variational Distribution Mapping (VDM) and timestep-dependent Distribution Coefficient Annealing (DCA). Both aim to model the distribution of rendered images more effectively, so that the generated 3D assets are high-fidelity and realistic while optimization remains fast. Specifically, VDM introduces a trainable degradation process and treats each rendered image as a degraded version of an image generated by the diffusion model, which avoids computing the Jacobian of the diffusion U-Net.

The paper also analyzes the mode-seeking behavior of SDS and finds that the correlation between the distribution of generated images and that of rendered images weakens as the timestep decreases. Based on this observation, DCA applies timestep-dependent coefficients that adapt to the changing distribution of rendered images, further improving generation quality.

In summary, the goal of the paper is to improve existing text-to-3D generation so that it produces more detailed and realistic 3D models while keeping the optimization process efficient.
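For readers unfamiliar with the SDS baseline discussed above, the following is a minimal NumPy sketch of the standard SDS gradient with respect to a rendered image, in which the U-Net Jacobian is omitted. It illustrates plain SDS only, not the paper's VDM or DCA. The linear beta schedule, the weighting w(t) = 1 - ᾱ_t, and the helpers `toy_eps_model` and `optimize` are assumptions made for illustration; in particular, `toy_eps_model` is a hypothetical stand-in for a pretrained diffusion model and simply acts as the ideal noise predictor for a data distribution concentrated on a single target image.

```python
import numpy as np


def make_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def add_noise(x0, eps, alpha_bar_t):
    """Forward diffusion: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps."""
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps


def sds_grad(x0, t, alphas_cumprod, eps_model, rng):
    """SDS gradient w.r.t. the rendered image x0: w(t) * (eps_pred - eps).

    The Jacobian of the noise predictor is deliberately left out,
    as in the standard SDS formulation.
    """
    a_bar = alphas_cumprod[t]
    eps = rng.standard_normal(x0.shape)
    x_t = add_noise(x0, eps, a_bar)
    eps_pred = eps_model(x_t, t)
    w = 1.0 - a_bar  # assumed weighting choice
    return w * (eps_pred - eps)


def toy_eps_model(target, alphas_cumprod):
    """Hypothetical noise predictor: exact for data concentrated at `target`."""
    def eps_model(x_t, t):
        a_bar = alphas_cumprod[t]
        return (x_t - np.sqrt(a_bar) * target) / np.sqrt(1.0 - a_bar)
    return eps_model


def optimize(x0, alphas_cumprod, eps_model, steps=200, lr=0.5, seed=0):
    """Gradient descent on the image itself, using SDS gradients at random timesteps."""
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    for _ in range(steps):
        t = int(rng.integers(20, 980))
        x -= lr * sds_grad(x, t, alphas_cumprod, eps_model, rng)
    return x
```

With this toy predictor the gradient reduces to a positive multiple of `x0 - target`, so `optimize` pulls the image toward the single mode of the toy distribution. In an actual text-to-3D pipeline the gradient would instead be backpropagated through a differentiable renderer into the 3D parameters (Gaussian Splatting in this paper).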

ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation

LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching

VividDreamer: Invariant Score Distillation For Hyper-Realistic Text-to-3D Generation

BoostDream: Efficient Refining for High-Quality Text-to-3D Generation from Multi-View Diffusion

StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D

Retrieval-Augmented Score Distillation for Text-to-3D Generation

AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation

JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation

ExactDreamer: High-Fidelity Text-to-3D Content Creation via Exact Score Matching

Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching

FlowDreamer: Exploring High Fidelity Text-to-3D Generation via Rectified Flow

Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model

DreamScape: 3D Scene Creation via Gaussian Splatting joint Correlation Modeling

DreamTime: An Improved Optimization Strategy for Diffusion-Guided 3D Generation

Text-to-3D Using Gaussian Splatting

Semantic Score Distillation Sampling for Compositional Text-to-3D Generation

Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior

Diverse and Stable 2D Diffusion Guided Text to 3D Generation with Noise Recalibration