3D Human Pose Estimation = 2D Pose Estimation + Matching

Ching-Hang Chen, Deva Ramanan

DOI: https://doi.org/10.48550/arXiv.1612.06524

2017-04-11

Abstract: We explore 3D human pose estimation from a single RGB image. While many approaches try to directly predict 3D pose from image measurements, we explore a simple architecture that reasons through intermediate 2D pose predictions. Our approach is based on two key observations: (1) Deep neural nets have revolutionized 2D pose estimation, producing accurate 2D predictions even for poses with self-occlusions. (2) Big-data sets of 3D mocap data are now readily available, making it tempting to lift predicted 2D poses to 3D through simple memorization (e.g., nearest neighbors). The resulting architecture is trivial to implement with off-the-shelf 2D pose estimation systems and 3D mocap libraries. Importantly, we demonstrate that such methods outperform almost all state-of-the-art 3D pose estimation systems, most of which directly try to regress 3D pose from 2D measurements.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses the estimation of 3D human pose from a single RGB image. Rather than predicting 3D pose directly from image measurements, it infers 3D pose through an intermediate 2D pose prediction. The approach rests on two key observations: (1) Deep neural networks have revolutionized 2D pose estimation and produce accurate 2D predictions even under self-occlusion. (2) Large 3D motion-capture datasets are now readily available, which makes it possible to lift predicted 2D poses to 3D through simple memorization, for example nearest-neighbor matching. By combining an off-the-shelf 2D pose estimation system with a 3D motion-capture library, the paper builds a simple architecture for estimating 3D human pose from a single RGB image. The authors report that this method outperforms almost all state-of-the-art 3D pose estimation systems, most of which try to regress 3D pose directly from 2D measurements.
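
The lifting step described above can be illustrated with a minimal Python sketch of nearest-neighbor matching. It is not the authors' implementation. The orthographic camera (dropping the depth coordinate), the center-and-scale normalization, the three-joint toy library, and all function names are assumptions made for illustration only. In practice the 2D pose would come from an off-the-shelf 2D pose estimator and the library from a 3D mocap dataset.

```python
import math


def normalize_2d(pose):
    """Center a 2D pose on its mean joint and scale it to unit norm."""
    n = len(pose)
    cx = sum(x for x, _ in pose) / n
    cy = sum(y for _, y in pose) / n
    centered = [(x - cx, y - cy) for x, y in pose]
    # Fall back to 1.0 if all joints coincide, to avoid dividing by zero.
    scale = math.sqrt(sum(x * x + y * y for x, y in centered)) or 1.0
    return [(x / scale, y / scale) for x, y in centered]


def project_orthographic(pose_3d):
    """Assumed camera model: drop the depth (z) coordinate."""
    return [(x, y) for x, y, _ in pose_3d]


def pose_distance(a, b):
    """Sum of squared joint-to-joint distances between two 2D poses."""
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b))


def lift_to_3d(pose_2d, library_3d):
    """Return the library 3D pose whose 2D projection best matches pose_2d."""
    query = normalize_2d(pose_2d)
    return min(
        library_3d,
        key=lambda p: pose_distance(
            query, normalize_2d(project_orthographic(p))),
    )


# Toy stand-in for a mocap library: two 3D poses with three joints each.
library = [
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)],  # straight chain
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.5)],  # bent chain
]

# Stand-in for a 2D prediction in pixel coordinates (roughly a bent chain).
predicted_2d = [(10.0, 10.0), (10.0, 20.0), (19.0, 21.0)]

match = lift_to_3d(predicted_2d, library)
print(match)  # the bent chain, including its depth values
```

Normalizing both the query and the projected exemplars makes the comparison insensitive to image position and scale. The depth of the returned pose comes entirely from the stored exemplar, which is the sense in which 2D poses are lifted to 3D by memorization.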
