DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer

Runjia Li, Junlin Han, Luke Melas-Kyriazi, Chunyi Sun, Zhaochong An, Zhongrui Gui, Shuyang Sun, Philip Torr, Tomas Jakab
2024-09-13
Abstract: We present DreamBeast, a novel method based on score distillation sampling (SDS) for generating fantastical 3D animal assets composed of distinct parts. Existing SDS methods often struggle with this generation task due to a limited understanding of part-level semantics in text-to-image diffusion models. While recent diffusion models, such as Stable Diffusion 3, demonstrate a better part-level understanding, they are prohibitively slow and exhibit other common problems associated with single-view diffusion models. DreamBeast overcomes this limitation through a novel part-aware knowledge transfer mechanism. For each generated asset, we efficiently extract part-level knowledge from the Stable Diffusion 3 model into a 3D Part-Affinity implicit representation. This enables us to instantly generate Part-Affinity maps from arbitrary camera views, which we then use to modulate the guidance of a multi-view diffusion model during SDS to create 3D assets of fantastical animals. DreamBeast significantly enhances the quality of generated 3D creatures with user-specified part compositions while reducing computational overhead, as demonstrated by extensive quantitative and qualitative evaluations.
Computer Vision and Pattern Recognition, Graphics, Machine Learning, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses the difficulty of generating 3D fantastical animal assets composed of multiple distinct parts. Specifically, existing methods based on score distillation sampling (SDS) face the following challenges when generating 3D creatures assembled from the parts of different animals:

1. **Insufficient part-level semantic understanding**: The text-to-image diffusion models that guide existing SDS methods have a limited understanding of part-level semantics, so they cannot reliably generate 3D creatures with specific part combinations.
2. **High computational cost**: Recent diffusion models such as Stable Diffusion 3 (SD3) show better part-level understanding, but they are prohibitively slow and exhibit other problems common to single-view diffusion models.
3. **Lack of part-level control**: Existing methods struggle to control individual parts according to the text descriptions of those parts.

To solve these problems, the paper proposes **DreamBeast**, a new SDS-based method that generates 3D fantastical creatures composed of different animal parts by introducing a part-aware knowledge transfer mechanism. The main contributions of DreamBeast include:

1. **Formulating, for the first time, the problem of part-level text-to-3D generation in an open-world setting**.
2. **Proposing a new knowledge transfer mechanism** that efficiently transfers the part-level understanding of 2D diffusion models to the 3D generation process.
3. **Significantly improving generation quality while reducing computational cost**, making the generation of part-aware 3D animal assets more efficient and reliable.
4. **Verifying the effectiveness of the method through quantitative evaluation and human studies**, which show that it outperforms the baseline methods in generating part-aware 3D creatures.
With these improvements, DreamBeast cuts generation time from 7 hours to 78 minutes (roughly 5.4 times faster) while maintaining high quality, and it also reduces GPU memory usage.
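
The idea of using a Part-Affinity map to modulate guidance can be illustrated with a toy sketch. The text above does not say how the modulation is applied, so this example assumes it acts on per-pixel attention weights over text tokens. The function name `modulate_attention`, the `strength` parameter, and the multiplicative blending rule are all hypothetical and are not taken from the DreamBeast implementation.

```python
# Illustrative sketch only. The function name, the `strength` parameter and
# the blending rule are assumptions, not the DreamBeast implementation.

def modulate_attention(attn, affinity, part_tokens, strength=1.0):
    """Boost each part token's attention where that part's affinity is high.

    attn:        attn[p][t] is the attention weight of pixel p on text token t
    affinity:    affinity[p][k] is the affinity of pixel p to part k, in [0, 1]
    part_tokens: part_tokens[k] is the index of the token describing part k
    """
    modulated = []
    for pixel_attn, pixel_aff in zip(attn, affinity):
        row = list(pixel_attn)
        for k, tok in enumerate(part_tokens):
            # Scale the part's token by (1 + strength * affinity).
            row[tok] *= 1.0 + strength * pixel_aff[k]
        total = sum(row)
        # Renormalise so each pixel's weights still sum to 1.
        modulated.append([w / total for w in row] if total > 0 else row)
    return modulated


# Toy view with two pixels and three tokens: 0 = "a", 1 = "wings", 2 = "tail".
attn = [[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]]
# Pixel 0 lies in the "wings" region, pixel 1 in the "tail" region.
affinity = [[1.0, 0.0], [0.0, 1.0]]
result = modulate_attention(attn, affinity, part_tokens=[1, 2])
# Pixel 0 now favours the "wings" token, pixel 1 the "tail" token.
```

In this sketch a pixel with zero affinity to every part keeps its original attention weights, so only the regions claimed by a part are affected.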