Abstract: Recovering 3D human poses from video sequences is of great significance in the field of motion capture. This paper proposes a novel approach to estimating 3D human action in which a deep convolutional neural network is trained end to end to regress the parameters of the parametric skinned multi-person linear (SMPL) model. The method has two main stages: (1) 3D human pose estimation from a single frame. We use 2D/3D skeleton point constraints, human height constraints, and generative adversarial network constraints to obtain a more accurate human-body model, and the model is pre-trained on open-source human pose datasets. (2) Human-body pose generation from video streams. We propose a video-stream-based 3D human pose recovery method that exploits the correlation between frames of a video sequence to generate a smoother 3D pose. To demonstrate the effectiveness of the proposed method, we compared it with a commercial motion capture platform. For this comparison, we first built a motion capture platform from two Kinect (V2) devices and the iPi Soft series software to obtain depth-camera video sequences and monocular-camera video sequences, respectively. We then defined several test tasks that vary the speed of the movements, the position of the subject, the orientation of the subject, and the complexity of the movements. Experimental results show that our low-cost method based on RGB video data achieves results similar to those of the commercial motion capture platform using RGB-D video data.
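
The abstract does not specify how the second stage turns per-frame estimates into a smoother pose sequence. As a purely illustrative stand-in (not the method proposed in the paper), the general idea of using neighbouring frames can be sketched with a centred moving average over per-frame 3D joint positions. The function name `smooth_poses`, the `radius` parameter, and the representation of a pose as a list of `(x, y, z)` tuples are all assumptions made for this sketch.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]   # one 3D joint position (x, y, z)
Pose = List[Joint]                   # all joints of one frame


def smooth_poses(poses: List[Pose], radius: int = 1) -> List[Pose]:
    """Average each joint over a window of neighbouring frames.

    Hypothetical helper: a centred moving average, with the window
    clipped at the start and end of the sequence. This only illustrates
    temporal smoothing in general, not the paper's network.
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")

    n_frames = len(poses)
    smoothed: List[Pose] = []
    for t in range(n_frames):
        # Frames t - radius .. t + radius, clipped to the valid range.
        lo = max(0, t - radius)
        hi = min(n_frames, t + radius + 1)
        window = poses[lo:hi]
        count = len(window)

        pose: Pose = []
        for j in range(len(poses[t])):
            # Mean of joint j over the window, one axis at a time.
            x = sum(frame[j][0] for frame in window) / count
            y = sum(frame[j][1] for frame in window) / count
            z = sum(frame[j][2] for frame in window) / count
            pose.append((x, y, z))
        smoothed.append(pose)
    return smoothed


if __name__ == "__main__":
    # Three frames of a single joint with a jitter spike in the middle frame.
    sequence = [[(0.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)]]
    print(smooth_poses(sequence, radius=1))
```

A fixed-window average trades responsiveness for smoothness: a larger `radius` suppresses more jitter but also blurs fast movements, which is one reason learned temporal models are usually preferred for this stage.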

An Effective 3D Human Pose Estimation Method Based on Dilated Convolutions for Videos

Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution

Motion Imitation of a Humanoid Robot Via Pose Estimation

Efficient Multi-person Hierarchical 3D Pose Estimation for Autonomous Driving

Enhanced 3D Human Pose Estimation from Videos by Using Attention-Based Neural Network with Dilated Convolutions

3D Human pose estimation from video via multi-scale multi-level spatial temporal features

Robust 3D Human Pose Estimation from Single Images or Video Sequences

Robust Estimation of 3D Human Poses from a Single Image

Unsupervised Universal Hierarchical Multi-Person 3D Pose Estimation for Natural Scenes

Towards Accurate Human Pose Estimation in Videos of Crowded Scenes

3D Human Pose Estimation with Spatial and Temporal Transformers

3D Human Pose Estimation using Spatio-Temporal Networks with Explicit Occlusion Training

3D human pose estimation in video with temporal convolutions and semi-supervised training

Geometry-Driven Self-Supervised Method for 3D Human Pose Estimation

Motion Capture Research: 3D Human Pose Recovery Based on RGB Video Sequences

Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications

APP: Adaptive Pose Pooling for 3D Human Pose Estimation from Videos

3D Human Pose and Shape Estimation with Dense Correspondence from a Single Depth Image

Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers

Learning Dynamical Human-Joint Affinity for 3D Pose Estimation in Videos

Towards Accurate Markerless Human Shape and Pose Estimation over Time