DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors

Biwen Lei, Kai Yu, Mengyang Feng, Miaomiao Cui, Xuansong Xie
DOI: https://doi.org/10.1109/cvpr52733.2024.00998
2024-01-01
Computer Vision and Pattern Recognition
Abstract: Text-guided domain adaptation and generation of 3D-aware portraits find many applications in various fields. However, due to the lack of training data and the challenges in handling the high variety of geometry and appearance, the existing methods for these tasks suffer from issues like inflexibility, instability, and low fidelity. In this paper, we propose a novel framework DiffusionGAN3D, which boosts text-guided 3D domain adaptation and generation by combining 3D GANs and diffusion priors. Specifically, we integrate the pre-trained 3D generative models (e.g., EG3D) and text-to-image diffusion models. The former provides a strong foundation for stable and high-quality avatar generation from text. And the diffusion models in turn offer powerful priors and guide the 3D generator finetuning with informative direction to achieve flexible and efficient text-guided domain adaptation. To enhance the diversity in domain adaptation and the generation capability in text-to-avatar, we introduce the relative distance loss and case-specific learnable triplane respectively. Besides, we design a progressive texture refinement module to improve the texture quality for both tasks above. Extensive experiments demonstrate that the proposed framework achieves excellent results in both domain adaptation and text-to-avatar tasks, outperforming existing methods in terms of generation quality and efficiency. The project homepage is at https://younglbw.github.io/DiffusionGAN3D-homepage/.