Abstract: Acquiring a human body shape with high precision and evaluating the reconstructed models effectively remain challenging, because the results are easily affected by factors such as the performance of the capture device, unwanted movement of the subject, and self-occlusion of the articulated body structure. To tackle these challenges, this research presents a passive acquisition system comprising 60 spatially configured Digital Single Lens Reflex (DSLR) cameras and a carefully devised algorithmic pipeline for shape acquisition in a single shot. Unlike traditional multi-view stereo solutions, the constituent cameras are synchronized and organized into 30 binocular stereo rigs that capture images from multiple views simultaneously. Each binocular stereo rig is regarded as a depth sensor. The acquisition pipeline consists of three stages. First, camera calibration is performed to estimate the intrinsic and extrinsic parameters of all cameras, especially those of the paired binocular cameras. Second, depth inference based on stereo matching is employed to recover reliable depth information from RGB images. A novel hierarchical seed-propagation stereo matching framework is proposed, yielding 30 dense and uniformly distributed partial point clouds. Finally, a point-based geometry processing step composed of multi-view registration and surface meshing is carried out to obtain high-quality watertight human body shapes. This research also proposes a novel method for assessing the accuracy of reconstructed non-rigid human body models based on anthropometric parameters, which resolves the problem of synchronizing the ground-truth values with the measured values. Experimental results show that the system achieves a reconstruction accuracy within 2.5 mm on average. (C) 2020 Elsevier Ltd. All rights reserved.
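
The idea of treating each binocular rig as a depth sensor can be illustrated with the standard rectified-stereo relation Z = f * B / d, where d is the disparity found by stereo matching. The sketch below is only an illustration under assumed conditions (an ideal rectified pair, pinhole cameras); the function names and the numeric values are hypothetical and are not taken from the paper, whose hierarchical seed-propagation matcher is not reproduced here.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_mm):
    """Depth in mm of a point seen by a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_mm / disparity_px


def depth_error_per_pixel(depth_mm, focal_px, baseline_mm):
    """First-order depth change caused by a one-pixel disparity error: Z^2 / (f * B)."""
    return depth_mm ** 2 / (focal_px * baseline_mm)


# Hypothetical rig: 4000 px focal length, 150 mm baseline, 400 px disparity.
z = depth_from_disparity(400.0, 4000.0, 150.0)
dz = depth_error_per_pixel(z, 4000.0, 150.0)
print(z, dz)
```

The second function shows why sub-pixel matching matters for millimetre-level targets: depth uncertainty grows with the square of the distance and shrinks with a longer focal length or a wider baseline.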

Synthetic Training for Monocular Human Mesh Recovery

Learning Monocular Mesh Recovery of Multiple Body Parts Via Synthesis

Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation

Recovering 3D Human Mesh from Monocular Images: A Survey

Marker-Less 3d Human Motion Capture With Monocular Image Sequence And Height-Maps

High-precision Human Body Acquisition Via Multi-View Binocular Stereopsis

MH‐HMR: Human mesh recovery from monocular images via multi‐hypothesis learning

Human Mesh Recovery from Arbitrary Multi-view Images

Monocular Expressive 3D Human Reconstruction of Multiple People

Monocular 3D Human Pose Estimation In The Wild Using Improved CNN Supervision

Temporally Coherent Full 3D Mesh Human Pose Recovery from Monocular Video

Synthetic Depth Transfer for Monocular 3D Object Pose Estimation in the Wild

Monocular Real-time Full Body Capture with Inter-part Correlations

CenterHMR: Multi-Person Center-based Human Mesh Recovery

Self-supervised 3D Human Mesh Recovery from Noisy Point Clouds

MUG: Multi-human Graph Network for 3D Mesh Reconstruction from 2D Pose

End-to-End Hand Mesh Recovery from a Monocular RGB Image

Monocular Human Pose and Shape Reconstruction Using Part Differentiable Rendering

End-to-end Recovery of Human Shape and Pose

3D Hand Mesh Recovery from Monocular RGB in Camera Space

Synthetic Training for Accurate 3D Human Pose and Shape Estimation in the Wild