Barbie: Text to Barbie-Style 3D Avatars

Xiaokun Sun, Zhenyu Zhang, Ying Tai, Qian Wang, Hao Tang, Zili Yi, Jian Yang
2024-09-24
Abstract: Recent advances in text-guided 3D avatar generation have made substantial progress by distilling knowledge from diffusion models. Despite the plausible generated appearance, existing methods cannot achieve fine-grained disentanglement or high-fidelity modeling between inner body and outfit. In this paper, we propose Barbie, a novel framework for generating 3D avatars that can be dressed in diverse and high-quality Barbie-like garments and accessories. Instead of relying on a holistic model, Barbie achieves fine-grained disentanglement on avatars by semantic-aligned separated models for human body and outfits. These disentangled 3D representations are then optimized by different expert models to guarantee the domain-specific fidelity. To balance geometry diversity and reasonableness, we propose a series of losses for template-preserving and human-prior evolving. The final avatar is enhanced by unified texture refinement for superior texture consistency. Extensive experiments demonstrate that Barbie outperforms existing methods in both dressed human and outfit generation, supporting flexible apparel combination and animation. The code will be released for research purposes. Our project page is: https://xiaokunsun.github.io/Barbie.github.io/
Computer Vision and Pattern Recognition
### What problems does this paper attempt to solve?

This paper addresses several key problems in current 3D virtual avatar generation:

1. **Fine-grained Decoupling and High-Fidelity Modeling**:
   - Existing methods cannot achieve fine-grained disentanglement between the human body and clothing when generating 3D virtual avatars, so the generated clothing and accessories lack detail and are difficult to combine flexibly.
   - The paper proposes a new framework named Barbie. By using separate, semantically aligned models for the human body and clothing, it achieves fine-grained decoupling and ensures domain-specific realism.
2. **Geometric Diversity and Rationality**:
   - To balance geometric diversity and rationality, the paper introduces a series of loss functions, such as a template-preserving loss and a human-prior evolving loss, so that the generated avatars are both diverse and consistent with human morphology.
3. **High-Quality Texture Consistency**:
   - In the final avatar synthesis, unified texture refinement is used to improve the consistency of the textures produced by different expert models, giving a more realistic overall appearance.
4. **Flexibility and Composability**:
   - The avatars generated by the Barbie framework support flexible clothing combinations and animation. Users can freely mix and match clothing and accessories, similar to the design concept of Barbie dolls.

### Main Contributions of the Framework

- **Innovative Generation Framework**: Barbie is presented as the first work to achieve fine-grained text-to-3D virtual avatar generation, producing highly decoupled human bodies, clothing, and accessories.
- **Application of Expert Models**: Applying domain-specific expert diffusion models at different optimization stages improves the realism of the generated geometry and texture.
- **Novel Loss Functions and Strategies**: Multiple new loss functions and optimization strategies are proposed to resolve the geometry and texture conflicts that can arise when combining different expert models.

### Experimental Results

In extensive experiments, Barbie significantly outperforms existing methods in virtual avatar and clothing generation, showing better geometric structure, texture quality, consistency with the text description, and fine-grained decoupling ability. According to Table 1 of the paper, Barbie achieves the best or second-best results on multiple evaluation criteria.

### Summary

The Barbie framework addresses the deficiencies of existing 3D virtual avatar generation through fine-grained decoupling, high-quality texture optimization, and flexible clothing combinations, providing new ideas and technical means for future research and applications.
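
The composability described above (a body and separately generated outfits that can be recombined) can be pictured with a minimal Python sketch. This is only an illustration of the idea, not the paper's implementation: the `Asset` and `Avatar` classes, their fields, and the `dress` and `swap` methods are all hypothetical names, and the assets here are plain labels standing in for real 3D meshes and textures.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Asset:
    """One independently generated component (hypothetical container)."""
    name: str      # e.g. "body", "jacket", "hat"
    category: str  # "body", "garment", or "accessory"


@dataclass
class Avatar:
    """A body plus an ordered list of outfit assets layered on top of it."""
    body: Asset
    outfits: list[Asset] = field(default_factory=list)

    def dress(self, asset: Asset) -> Avatar:
        # Return a new avatar so the same body can be reused with other outfits.
        if asset.category == "body":
            raise ValueError("outfits must be garments or accessories")
        return Avatar(self.body, [*self.outfits, asset])

    def swap(self, old_name: str, new: Asset) -> Avatar:
        # Replace one outfit by name; the body and other outfits are untouched.
        return Avatar(
            self.body,
            [new if a.name == old_name else a for a in self.outfits],
        )

    def describe(self) -> str:
        return " + ".join([self.body.name, *(a.name for a in self.outfits)])


# Assumed example assets; a real system would hold geometry and textures.
body = Asset("body", "body")
look = Avatar(body).dress(Asset("jacket", "garment")).dress(Asset("hat", "accessory"))
print(look.describe())
print(look.swap("jacket", Asset("coat", "garment")).describe())
```

Because the body and each outfit are kept as separate objects, changing one garment does not require regenerating anything else, which is the property that a single holistic avatar model lacks.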