Abstract: Training an accurate 3D human pose estimator often requires a large amount of 3D ground-truth data, which is inefficient and costly to collect. Previous methods have either resorted to weak supervision to reduce the demand for ground-truth training data, or used synthetically generated but photo-realistic samples to enlarge the training data pool. Nevertheless, the former methods mainly require additional supervision, such as unpaired 3D ground-truth data or camera parameters in multiview settings. The latter methods, on the other hand, require accurately textured models, illumination configurations, and backgrounds, all of which need careful engineering. To address these problems, we propose a domain adaptation framework with unsupervised knowledge transfer, which aims to leverage the knowledge in multi-modality data of easy-to-get synthetic depth datasets to better train a pose estimator on real-world datasets. Specifically, the framework first trains two pose estimators on synthetically generated depth images and human body segmentation masks with full supervision, while jointly learning a human body segmentation module from the predicted 2D poses. Subsequently, the learned pose estimator and the segmentation module are applied to the real-world dataset to learn a new RGB-image-based 2D/3D human pose estimator without supervision. Here, the knowledge encoded in the supervised learning modules is used to regularize a pose estimator trained without ground-truth annotations. Comprehensive experiments demonstrate significant improvements over weakly supervised methods when no ground-truth annotations are available. Further experiments with ground-truth annotations show that the proposed framework can outperform state-of-the-art fully supervised methods. In addition, we conducted ablation studies to examine the impact of each loss term, as well as of different amounts of supervision signal.

Hardmining Training Via Self-Adversarial Network for Human Pose Estimation

Exploring Hard Joints Mining Via Hourglass-Based Generative Adversarial Network for Human Pose Estimation

STN-enhanced Message Passing Guided by Adversarial Learning for Human Pose Estimation

Improved Multi-Person 2D Human Pose Estimation Using Attention Mechanisms and Hard Example Mining

Multi-Person Pose Estimation in the Wild: Using Adversarial Method to Train a Top-Down Pose Estimation Network

3D Human Pose Estimation Based on 2D-3D Consistency with Synchronized Adversarial Training

Enhancement and Optimisation of Human Pose Estimation with Multi-Scale Spatial Attention and Adversarial Data Augmentation

An Adaptive Human Posture Detection Algorithm Based on Generative Adversarial Network

3D human pose estimation based on 2D–3D consistency with synchronized adversarial training

Unsupervised Domain Adaptation for 3D Human Pose Estimation

Geometry-Driven Self-Supervised Method for 3D Human Pose Estimation

An adversarial human pose estimation network injected with graph structure

Adversarial Learning Enhancement for 3D Human Pose and Shape Estimation

Weakly Supervised Adversarial Learning for 3D Human Pose Estimation from Point Clouds

Adversarial Semantic Data Augmentation for Human Pose Estimation

3D Human Pose Estimation with Adversarial Learning

3D Human Pose Estimation with Generative Adversarial Networks

Human Pose Estimation Based On Secondary Generation Adversary

Self-supervised Method for 3D Human Pose Estimation with Consistent Shape and Viewpoint Factorization

Skeleton-aware Graph-based Adversarial Networks for Human Pose Estimation from Sparse IMUs

Human Pose Estimation Based on Step Deep Convolution Neural Network