HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image

Tong Wu, Zhibing Li, Shuai Yang, Pan Zhang, Xinggang Pan, Jiaqi Wang, Dahua Lin, Ziwei Liu
2023-12-08
Abstract: 3D content creation from a single image is a long-standing yet highly desirable task. Recent advances introduce 2D diffusion priors, yielding reasonable results. However, existing methods are not hyper-realistic enough for post-generation usage, as users cannot view, render and edit the resulting 3D content from a full range. To address these challenges, we introduce HyperDreamer with several key designs and appealing properties: 1) Viewable: 360 degree mesh modeling with high-resolution textures enables the creation of visually compelling 3D models from a full range of observation points. 2) Renderable: Fine-grained semantic segmentation and data-driven priors are incorporated as guidance to learn reasonable albedo, roughness, and specular properties of the materials, enabling semantic-aware arbitrary material estimation. 3) Editable: For a generated model or their own data, users can interactively select any region via a few clicks and efficiently edit the texture with text-based guidance. Extensive experiments demonstrate the effectiveness of HyperDreamer in modeling region-aware materials with high-resolution textures and enabling user-friendly editing. We believe that HyperDreamer holds promise for advancing 3D content creation and finding applications in various domains.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the problem of generating and editing hyper-realistic 3D content from a single image. Specifically, existing 3D content generation methods have two main problems:

1. **Limited post-generation usability**: Current methods usually use implicit 3D representations, which trade usability for fidelity. Users cannot freely scale, re-render, or edit the generated 3D content, which limits its practical applications and creative potential.
2. **2D diffusion bias**: These methods rely on diffusion models trained on 2D datasets with rich lighting and shadow variations. Although these variations enhance the realism of 2D images, they also cause unwanted effects in the textures of 3D models (as shown in Figure 4-d).

To solve these problems, the authors propose the HyperDreamer framework, which has three key features:

1. **Full-range viewability**: By introducing a novel custom super-resolution module, HyperDreamer can generate high-resolution textures and support viewing 3D models from any viewing angle.
2. **Full-range renderability**: By combining online 3D semantic segmentation with a spatially-varying bidirectional reflectance distribution function (BRDF), HyperDreamer can learn reasonable albedo, roughness, and specular properties of materials, thus achieving more realistic rendering.
3. **Full-range editability**: HyperDreamer allows users to select specific regions with a few clicks and efficiently edit their textures based on text guidance, which makes editing more flexible and easier to use.

According to the authors, these improvements raise the quality of generated 3D content and make AI-generated 3D content more accessible and practical in real applications.
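
To make the idea of region-aware materials more concrete, the following is a minimal sketch of how a semantic label could select roughness and specular values that then drive shading at a surface point. It is not the authors' implementation: the `MATERIAL_PRIORS` table, its values, and the `shade` function are hypothetical, and the reflectance model is a swapped-in Lambertian diffuse term plus a Blinn-Phong specular lobe rather than the BRDF used in the paper.

```python
import math

# Hypothetical per-region material priors (illustrative values, not from the paper).
MATERIAL_PRIORS = {
    "metal": {"roughness": 0.2, "specular": 0.9},
    "fabric": {"roughness": 0.9, "specular": 0.05},
}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    # Leave a zero vector unchanged instead of dividing by zero.
    return tuple(x / length for x in v) if length > 0.0 else tuple(v)


def shade(albedo, label, normal, light_dir, view_dir):
    """Shade one surface point: Lambertian diffuse plus Blinn-Phong specular.

    `label` is the semantic region of the point; it picks the material
    parameters, so regions with the same albedo can still look different.
    """
    mat = MATERIAL_PRIORS[label]
    n = normalize(normal)
    l = normalize(light_dir)
    v = normalize(view_dir)
    n_dot_l = max(dot(n, l), 0.0)
    half = normalize(tuple(a + b for a, b in zip(l, v)))
    # Common heuristic mapping from roughness to a Blinn-Phong exponent:
    # a rougher surface gets a smaller exponent, hence a wider highlight.
    shininess = max(2.0 / (mat["roughness"] ** 2) - 2.0, 0.0)
    spec = mat["specular"] * max(dot(n, half), 0.0) ** shininess
    return tuple((a + spec) * n_dot_l for a in albedo)


if __name__ == "__main__":
    up = (0.0, 0.0, 1.0)
    grey = (0.5, 0.5, 0.5)
    print("metal :", shade(grey, "metal", up, up, up))
    print("fabric:", shade(grey, "fabric", up, up, up))
```

With identical albedo and lighting, the "metal" point comes out brighter head-on than the "fabric" point because its label selects a stronger specular term. That difference is the kind of region-dependent appearance the semantic guidance is meant to capture.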