Abstract: Recent 3D generative models have achieved remarkable performance in synthesizing high-resolution, photorealistic images with view consistency and detailed 3D shapes, but training them for diverse domains is challenging because it requires massive numbers of training images along with their camera distribution information. Text-guided domain adaptation methods have shown impressive performance in converting a 2D generative model trained on one domain into models for other domains with different styles by leveraging CLIP (Contrastive Language-Image Pre-training), rather than collecting massive datasets for those domains. One drawback of these methods, however, is that the sample diversity of the original generative model is not well preserved in the domain-adapted models, owing to the deterministic nature of the CLIP text encoder. Text-guided domain adaptation is even more challenging for 3D generative models, not only because of catastrophic diversity loss but also because of inferior text-image correspondence and poor image quality. Here we propose DATID-3D, a domain adaptation method tailored for 3D generative models that uses text-to-image diffusion models, which can synthesize diverse images per text prompt, without collecting additional images or camera information for the target domain. Unlike 3D extensions of prior text-guided domain adaptation methods, our novel pipeline fine-tunes the state-of-the-art 3D generator of the source domain to synthesize high-resolution, multi-view-consistent images in text-guided target domains without additional data, outperforming existing text-guided domain adaptation methods in diversity and text-image correspondence. Furthermore, we propose and demonstrate diverse 3D image manipulations, such as one-shot instance-selected adaptation and single-view manipulated 3D reconstruction, that take full advantage of the diversity available in text.

DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaptation by Combining 3D GANs and Diffusion Priors

StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation

Guide3D: Create 3D Avatars from Text and Image Guidance

PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image Diffusion

DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model

Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior

Articulated 3D Head Avatar Generation using Text-to-Image Diffusion Models

IT3D: Improved Text-to-3D Generation with Explicit View Synthesis

Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy

3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation

3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models

ET3D: Efficient Text-to-3D Generation via Multi-View Distillation

Text-Driven Diverse Facial Texture Generation via Progressive Latent-Space Refinement

PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion

Text-Guided 3D Face Synthesis -- From Generation to Editing

Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation

GenesisTex2: Stable, Consistent and High-Quality Text-to-Texture Generation

VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation

Human 3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion Models