Abstract: Whole-body pose estimation aims to regress human pose models that include body, hand, and facial details from RGB images. While whole-body mesh recovery has been extensively studied in recent literature, the focus has predominantly been on a single person, despite the frequent occurrence of multiple people in practical scenarios. As in the body-only case, single-person whole-body pose estimation methods often fail in multi-person settings for two reasons: (i) A bounding box can be ambiguous and contain more than one instance, which makes it difficult for single-person-oriented methods to regress the body mesh model of the target person. (ii) Single-person pose estimation approaches neglect person-person occlusions and the depth order among instances, and thus generate interpenetrating models. In this paper, we propose the Multi-person Expressive POse (MEPO) model, which reconstructs expressive 3D human models for multiple people. To the best of our knowledge, our model is the first multi-person whole-body mesh reconstruction model; it is strengthened by a heatmap, a depthmap, and a depth order loss. We propose the Heatmap Enhancement Net (HENet), which leverages heatmap information to help the model concentrate on the target person in crowded multi-person cases, while the depthmap supplies depth information about the image. Furthermore, we impose a depth order loss to recover human meshes precisely for overlapping people. In our experiments, we evaluate our model on multiple challenging datasets, including AGORA, which contains complex occlusions similar to real-world scenarios. Our method achieves a significant performance improvement over state-of-the-art pose estimation methods.
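The abstract does not give the form of the depth order loss, so the following is only a minimal sketch of one common way such a term can be written: a pairwise softplus ranking penalty on per-person depths. The function names `softplus` and `depth_order_loss`, the use of one scalar depth per person, and the `(near, far)` pair format are all assumptions made for illustration and are not taken from the MEPO paper.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def depth_order_loss(pred_depths, ordered_pairs):
    """Hypothetical pairwise depth-order penalty (not the paper's definition).

    pred_depths   : one predicted camera-space depth per person (assumed).
    ordered_pairs : list of (near, far) index pairs, meaning person `near`
                    is known to be closer to the camera than person `far`.

    Each pair contributes softplus(d_near - d_far), which is small when the
    predicted order agrees with the annotation and grows when it is violated.
    """
    if not ordered_pairs:
        return 0.0
    total = 0.0
    for near, far in ordered_pairs:
        total += softplus(pred_depths[near] - pred_depths[far])
    return total / len(ordered_pairs)


# Person 0 is annotated as being in front of person 1.
agree = depth_order_loss([2.0, 5.0], [(0, 1)])     # predicted order is correct
violate = depth_order_loss([5.0, 2.0], [(0, 1)])   # predicted order is flipped
```

Under these assumptions, `agree` is smaller than `violate`, so minimizing the term pushes overlapping people toward the annotated front-to-back order. A pixel-wise variant over rendered depth maps would follow the same pattern.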

CenterHMR: Multi-Person Center-based Human Mesh Recovery

Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution.

MH‐HMR: Human mesh recovery from monocular images via multi‐hypothesis learning

Monocular Expressive 3D Human Reconstruction of Multiple People

End-to-end Recovery of Human Shape and Pose

Human Mesh Recovery from Arbitrary Multi-view Images

Monocular, One-stage, Regression of Multiple 3D People

Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation

Visibility-Aware Human Mesh Recovery Via Balancing Dense Correspondence and Probability Model

Unsupervised Universal Hierarchical Multi-Person 3D Pose Estimation for Natural Scenes

VoteHMR: Occlusion-Aware Voting Network for Robust 3D Human Mesh Recovery from Partial Point Clouds

Marker-Less 3d Human Motion Capture With Monocular Image Sequence And Height-Maps

Synthetic Training for Monocular Human Mesh Recovery

Recovering 3D Human Mesh from Monocular Images: A Survey

Multi-RoI Human Mesh Recovery with Camera Consistency and Contrastive Losses

Center point to pose: Multiple views 3D human pose estimation for multi-person

Coherent Reconstruction of Multiple Humans from a Single Image

Multi-hypotheses Conditioned Point Cloud Diffusion for 3D Human Reconstruction from Occluded Images

MUG: Multi-human Graph Network for 3D Mesh Reconstruction from 2D Pose

Dynamic Multi-Person Mesh Recovery From Uncalibrated Multi-View Cameras

PC-HMR: Pose Calibration for 3D Human Mesh Recovery from 2D Images/Videos