3D Face Modeling via Weakly-supervised Disentanglement Network joint Identity-consistency Prior

Guohao Li, Hongyu Yang, Di Huang, Yunhong Wang
2024-04-25
Abstract: Generative 3D face models featuring disentangled controlling factors hold immense potential for diverse applications in computer vision and computer graphics. However, previous 3D face modeling methods face a challenge as they demand specific labels to effectively disentangle these factors. This becomes particularly problematic when integrating multiple 3D face datasets to improve the generalization of the model. Addressing this issue, this paper introduces a Weakly-Supervised Disentanglement Framework, denoted as WSDF, to facilitate the training of controllable 3D face models without an overly stringent labeling requirement. Adhering to the paradigm of Variational Autoencoders (VAEs), the proposed model achieves disentanglement of identity and expression controlling factors through a two-branch encoder equipped with dedicated identity-consistency prior. It then faithfully re-entangles these factors via a tensor-based combination mechanism. Notably, the introduction of the Neutral Bank allows precise acquisition of subject-specific information using only identity labels, thereby averting degeneration due to insufficient supervision. Additionally, the framework incorporates a label-free second-order loss function for the expression factor to regulate deformation space and eliminate extraneous information, resulting in enhanced disentanglement. Extensive experiments have been conducted to substantiate the superior performance of WSDF. Our code is available at
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the challenge of disentangling identity and expression factors in 3D face modeling. Existing 3D face modeling methods require specific labels to disentangle these factors effectively, a requirement that becomes especially problematic when multiple 3D face datasets are combined to improve generalization. To relax it, the paper introduces a Weakly-Supervised Disentanglement Framework (WSDF) built on the variational autoencoder (VAE) paradigm. A two-branch encoder with a dedicated identity-consistency prior disentangles identity from expression, and a tensor-based combination mechanism re-entangles the two factors. A Neutral Bank lets the model acquire subject-specific information using only identity labels, which averts degeneration caused by insufficient supervision. In addition, a label-free second-order loss on the expression factor regulates the deformation space and removes extraneous information, further improving disentanglement. The authors report extensive experiments supporting the superior performance of WSDF.
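To make the idea of a two-branch latent code and a tensor-based combination more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the form of the tensor mechanism, so a plain bilinear map is assumed, and every name, dimension, and value here (`N_VERTS`, `D_ID`, `D_EXP`, `core`, `reparameterize`, `combine`) is hypothetical. Random numbers stand in for the outputs of the two encoder branches.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_VERTS = 100        # number of mesh vertices
D_ID, D_EXP = 8, 4   # identity / expression latent dimensions


def reparameterize(mu, log_var, rng):
    """Standard VAE reparameterization: z = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def combine(core, z_id, z_exp):
    """Assumed bilinear combination of the two codes.

    out[v] = sum_{i,j} core[v, i, j] * z_id[i] * z_exp[j]
    """
    return np.einsum("vij,i,j->v", core, z_id, z_exp)


# Stand-in for a learned core tensor (random here, not trained).
core = 0.01 * rng.standard_normal((3 * N_VERTS, D_ID, D_EXP))

# Stand-ins for the outputs of the identity and expression branches.
mu_id, log_var_id = rng.standard_normal(D_ID), np.zeros(D_ID)
mu_exp, log_var_exp = rng.standard_normal(D_EXP), np.zeros(D_EXP)

z_id = reparameterize(mu_id, log_var_id, rng)
z_exp = reparameterize(mu_exp, log_var_exp, rng)

# Flattened (x, y, z) coordinates reshaped into one row per vertex.
mesh = combine(core, z_id, z_exp).reshape(N_VERTS, 3)
```

Because the assumed map is linear in each code separately, swapping `z_exp` while holding `z_id` fixed changes the output only through the expression factor, which is the kind of independent control that disentangled codes are meant to provide.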