Abstract: The basic body shape of a person does not change within a single video. However, most SOTA human mesh estimation (HME) models output a slightly different body shape for each video frame, which results in inconsistent body shapes for the same person. In contrast, we leverage anthropometric measurements like those tailors have been taking from humans for centuries. We create a model called A2B that converts such anthropometric measurements to body shape parameters of human mesh models. Moreover, we find that finetuned SOTA 3D human pose estimation (HPE) models outperform HME models regarding the precision of the estimated keypoints. We show that applying inverse kinematics (IK) to the results of such a 3D HPE model and combining the resulting body pose with the A2B body shape leads to superior and consistent human meshes for challenging datasets like ASPset or fit3D, where we can lower the MPJPE by over 30 mm compared to SOTA HME models. Further, replacing HME models' estimates of the body shape parameters with A2B model results not only increases the performance of these HME models, but also leads to consistent body shapes.
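The idea of replacing per-frame shape estimates with one fixed shape can be illustrated with a minimal sketch. It assumes the HME output is available as an array of per-frame shape parameters and that a single shape vector (for example, one produced by A2B) is already given. The function names and the spread measure are hypothetical and are not taken from the paper.

```python
import numpy as np


def shape_inconsistency(betas_per_frame):
    """Mean per-parameter standard deviation of shape parameters across frames.

    betas_per_frame: array-like of shape (num_frames, num_shape_params).
    A value of 0.0 means every frame uses exactly the same body shape.
    This measure is an illustrative assumption, not the paper's metric.
    """
    betas = np.asarray(betas_per_frame, dtype=float)
    return float(betas.std(axis=0).mean())


def use_fixed_shape(betas_per_frame, fixed_betas):
    """Replace every frame's shape parameters with one fixed shape vector."""
    betas = np.asarray(betas_per_frame, dtype=float)
    fixed = np.asarray(fixed_betas, dtype=float)
    if fixed.shape != betas.shape[1:]:
        raise ValueError("fixed_betas must have one value per shape parameter")
    # Broadcast the single vector over all frames and return a writable copy.
    return np.broadcast_to(fixed, betas.shape).copy()


# Toy example: two frames whose estimated shapes disagree.
per_frame = [[0.0, 0.0], [2.0, 2.0]]
consistent = use_fixed_shape(per_frame, [1.0, 1.0])
```

In this toy example the per-frame estimates have a non-zero spread, while the array returned by `use_fixed_shape` has a spread of exactly zero by construction.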

What problem does this paper attempt to address?

This paper aims to solve **the problem of inconsistent human body shapes produced by human mesh estimation (HME) models in videos**. Specifically:

1. **Problem background**:
   - When processing videos, HME models usually output a slightly different body shape for each frame, even when the frames show the same person in consecutive motion. The same person therefore ends up with inconsistent body shapes within one video.
   - The inconsistency is more pronounced in scenarios with rapidly changing poses, such as sports, and it degrades the accuracy of 3D pose and shape estimation.
2. **Limitations of existing methods**:
   - Most current HME models are trained on single images and do not process an entire video sequence, so they struggle to keep the body shape of the same person consistent across frames.
   - Similar inconsistencies also exist in existing 3D pose and mesh datasets, which further affects model performance.
3. **Solutions**:
   - The paper proposes a model named A2B (Anthropometric to Body shape), which uses anthropometric measurements (such as those taken by tailors) to produce consistent and accurate body shape parameters.
   - A2B converts anthropometric measurements into the shape parameters of common human mesh models such as SMPL-X, so that the body shape of the same person stays the same in all frames.
   - In addition, the paper applies inverse kinematics (IK) to the output of a fine-tuned 3D human pose estimation (HPE) model to obtain a more precise body pose, and combines that pose with the A2B body shape to produce a more accurate and consistent human mesh.
4. **Experimental verification**:
   - The authors conducted experiments on challenging datasets such as ASPset and fit3D.
   - According to the abstract, combining the IK-derived pose with the A2B body shape lowers the MPJPE (Mean Per Joint Position Error) by over 30 mm compared to SOTA HME models, and replacing the shape estimates of HME models with A2B results also improves the performance of those models.

### Summary

The main goal of this paper is to solve the problem of inconsistent body shapes produced by HME models in videos. It does so by introducing the A2B model and a pose estimation pipeline based on a fine-tuned 3D HPE model with IK, thereby improving both the accuracy and the consistency of 3D human mesh estimation.
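For readers unfamiliar with the metric, MPJPE is the mean Euclidean distance between predicted and ground-truth 3D joint positions. The sketch below is a minimal NumPy version under stated assumptions: both inputs are arrays of identical shape whose last axis holds the 3D coordinates, and no alignment is applied beforehand. Evaluation protocols often align the root joint first, and the exact protocol used in the paper is not stated here. The function name `mpjpe` is chosen for illustration.

```python
import numpy as np


def mpjpe(pred_joints, gt_joints):
    """Mean Per Joint Position Error.

    pred_joints, gt_joints: array-likes of shape (..., num_joints, 3),
    in the same units (for example millimetres).
    Returns the mean Euclidean distance over all joints (and frames).
    """
    pred = np.asarray(pred_joints, dtype=float)
    gt = np.asarray(gt_joints, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("prediction and ground truth must have the same shape")
    # Per-joint Euclidean distance along the coordinate axis, then average.
    per_joint_error = np.linalg.norm(pred - gt, axis=-1)
    return float(per_joint_error.mean())


# Toy example: 2 frames, 3 joints, every predicted joint shifted by (3, 4, 0) mm.
gt = np.zeros((2, 3, 3))
pred = gt + np.array([3.0, 4.0, 0.0])
error_mm = mpjpe(pred, gt)
```

In the toy example every joint is displaced by a vector of length 5, so `error_mm` is 5.0.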

Leveraging Anthropometric Measurements to Improve Human Mesh Estimation and Ensure Consistent Body Shapes

A Semantic Parametric Model for 3D Human Body Reshaping

Im2Fit: Fast 3D Model Fitting and Anthropometrics using Single Consumer Depth Camera and Synthetic Data

Estimation of Human Body Shape and Posture Under Clothing

Measurements-to-body: 3D human body reshaping based on anthropometric measurements

3D Human Body Reshaping with Anthropometric Modeling

LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies

Accurate 3D Body Shape Regression using Metric and Semantic Attributes

Estimating 3D body mesh without SMPL annotations via alternating successive convex approximation

3D Body Shapes Estimation from Dressed-Human Silhouettes

High-precision Human Body Acquisition Via Multi-View Binocular Stereopsis

Personalized 3D Human Pose and Shape Refinement

HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation

The Best of Both Worlds: Combining Model-based and Nonparametric Approaches for 3D Human Body Estimation

General Automatic Human Shape and Motion Capture Using Volumetric Contour Cues

3D Human Pose Estimation Based on Wearable IMUs and Multiple Camera Views

TailorMe: Self-Supervised Learning of an Anatomically Constrained Volumetric Human Shape Model

Subject-Specific Human Modeling for Human Pose Estimation

Detailed Human Shape Estimation From A Single Image By Hierarchical Mesh Deformation

Towards Accurate Markerless Human Shape and Pose Estimation over Time