Abstract:Reconstructing 3D human shape and pose from monocular images is challenging despite the promising results achieved by the most recent learning-based methods. The commonly occurred misalignment comes from the facts that the mapping from images to the model space is highly non-linear and the rotation-based pose representation of body models is prone to result in the drift of joint positions. In this work, we investigate learning 3D human shape and pose from dense correspondences of body parts and propose a Decompose-and-aggregate Network (DaNet) to address these issues. DaNet adopts the dense correspondence maps, which densely build a bridge between 2D pixels and 3D vertices, as intermediate representations to facilitate the learning of 2D-to-3D mapping. The prediction modules of DaNet are decomposed into one global stream and multiple local streams to enable global and fine-grained perceptions for the shape and pose predictions, respectively. Messages from local streams are further aggregated to enhance the robust prediction of the rotation-based poses, where a position-aided rotation feature refinement strategy is proposed to exploit spatial relationships between body joints. Moreover, a Part-based Dropout (PartDrop) strategy is introduced to drop out dense information from intermediate representations during training, encouraging the network to focus on more complementary body parts as well as neighboring position features. The efficacy of the proposed method is validated on both indoor and real-world datasets including Human3.6M, UP3D, COCO, and 3DPW, showing that our method could significantly improve the reconstruction performance in comparison with previous state-of-the-art methods. Our code is publicly available at <a class="link-external link-https" href="https://hongwenzhang.github.io/dense2mesh" rel="external noopener nofollow">this https URL</a> .

LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies

Implicit Neural Representations With Structured Latent Codes for Human Body Modeling

Self Supervised Networks for Learning Latent Space Representations of Human Body Scans and Motions

Unsupervised Shape and Pose Disentanglement for 3D Meshes

Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows

Dynamic Human Body Reconstruction and Motion Tracking with Low-Cost Depth Cameras

Disentangled Human Body Embedding Based on Deep Hierarchical Neural Network

3D Human Pose and Shape Estimation with Dense Correspondence from a Single Depth Image

Topology-Preserved Human Reconstruction with Details

Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans

Neural Descent for Visual 3D Human Pose and Shape

Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation

Neural Body Fitting: Unifying Deep Learning and Model-Based Human Pose and Shape Estimation

Deep3DPose: Realtime Reconstruction of Arbitrarily Posed Human Bodies from Single RGB Images

Identity-Disentangled Neural Deformation Model for Dynamic Meshes

H4D: Human 4D Modeling by Learning Neural Compositional Representation

Towards Accurate Markerless Human Shape and Pose Estimation over Time

Learning 3D Human Shape and Pose from Dense Body Parts

Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction

Parametric Human Shape Reconstruction Via Bidirectional Silhouette Guidance

Personalized 3D Human Pose and Shape Refinement