Learning Pose Controllable Human Reconstruction with Dynamic Implicit Fields from a Single Image

Jituo Li, Xinqi Liu, Guodong Lu
DOI: https://doi.org/10.1109/tvcg.2024.3363493
IF: 5.2
2024-01-01
IEEE Transactions on Visualization and Computer Graphics
Abstract: Recovering a user-specific, controllable human model from a single RGB image is a nontrivial challenge. Existing methods usually generate static results in which the subject's pose matches the input image. Our work aims to achieve pose-controllable human reconstruction from a single image by learning a dynamic (multi-pose) implicit field. We first construct a feature-embedded human model (FEHM) as a bridge to propagate image features to different pose spaces. Based on FEHM, we then encode three pose-decoupled features. Global image features represent the user-specific shape in the image and replace the widely used pixel-aligned approaches to avoid unwanted shape-pose entanglement. Spatial color features propagate FEHM-embedded image cues into 3D pose space to provide spatial high-frequency guidance. Spatial geometry features improve reconstruction robustness by using the surface shape of the FEHM as a prior. Finally, new implicit functions are designed to predict the dynamic human implicit fields. For effective supervision, a realistic human avatar dataset, SimuSCAN, with 1000+ models is constructed using a low-cost hierarchical mesh registration method. Extensive experiments demonstrate that our method achieves state-of-the-art reconstruction quality.
computer science, software engineering