Abstract:Reconstructing 3D human shape and pose from monocular images is challenging despite the promising results achieved by the most recent learning-based methods. The commonly occurred misalignment comes from the facts that the mapping from images to the model space is highly non-linear and the rotation-based pose representation of body models is prone to result in the drift of joint positions. In this work, we investigate learning 3D human shape and pose from dense correspondences of body parts and propose a Decompose-and-aggregate Network (DaNet) to address these issues. DaNet adopts the dense correspondence maps, which densely build a bridge between 2D pixels and 3D vertices, as intermediate representations to facilitate the learning of 2D-to-3D mapping. The prediction modules of DaNet are decomposed into one global stream and multiple local streams to enable global and fine-grained perceptions for the shape and pose predictions, respectively. Messages from local streams are further aggregated to enhance the robust prediction of the rotation-based poses, where a position-aided rotation feature refinement strategy is proposed to exploit spatial relationships between body joints. Moreover, a Part-based Dropout (PartDrop) strategy is introduced to drop out dense information from intermediate representations during training, encouraging the network to focus on more complementary body parts as well as neighboring position features. The efficacy of the proposed method is validated on both indoor and real-world datasets including Human3.6M, UP3D, COCO, and 3DPW, showing that our method could significantly improve the reconstruction performance in comparison with previous state-of-the-art methods. Our code is publicly available at <a class="link-external link-https" href="https://hongwenzhang.github.io/dense2mesh" rel="external noopener nofollow">this https URL</a> .

Deep Layer and Spatial Aggregation neural network for human pose estimation

Densely Connected Attentional Pyramid Residual Network for Human Pose Estimation.

Human Pose Estimation Using Deep Structure Guided Learning.

Deep Dual Consecutive Network for Human Pose Estimation

Implicit Decouple Network for Efficient Pose Estimation

Multi-person pose estimation using atrous convolution

SPCNet:Spatial Preserve and Content-aware Network for Human Pose Estimation

Learning 3D Human Shape and Pose from Dense Body Parts

LSDNet: lightweight stochastic depth network for human pose estimation

EANet: Towards Lightweight Human Pose Estimation With Effective Aggregation Network

A Deconvolutional Bottom-up Deep Network for Multi-Person Pose Estimation.

AMANet: Adaptive Multi-Path Aggregation for Learning Human 2D-3D Correspondences

DRSI-Net: Dual-Residual Spatial Interaction Network for Multi-Person Pose Estimation

Deep High-Resolution Representation Learning For Human Pose Estimation

Complementary Feature Pyramid Network for Human Pose Estimation

Multi-Scale Structure-Aware Network for Human Pose Estimation

A Deep Structure for Human Pose Estimation

SLBNet: Shallow and Lightweight Bilateral Network for Pose Estimation

DSPNet: A low computational-cost network for human pose estimation

An improved lightweight high-resolution network based on multi-dimensional weighting for human pose estimation

Stacked Hourglass Networks for Human Pose Estimation