Abstract: Human pose estimation aims to locate the anatomical parts, or keypoints, of the human body and is regarded as a core component of detailed human understanding in images and videos. However, occlusion and overlap of human bodies, together with complex backgrounds, often result in implausible pose predictions. To address this problem, we propose a structure-aware adversarial framework that combines cues of local joint interconnectivity with priors about the holistic structure of human bodies, achieving high-quality results for multiperson pose estimation. Learning such cues and priors effectively is typically a challenge. The presented framework uses a nonparametric representation, referred to as the Keypoint Biorientation Field (KBOF), to learn orientation cues of joint interconnectivity in the image, much as human vision exploits the geometric constraints of joint interconnectivity. Additionally, the framework applies a module that uses multiscale feature representation with inflated convolution for joint heatmap detection and KBOF detection, so as to fully explore the local features of joint points and the bidirectional connectivity between them at the microscopic level. Finally, we employ an improved generative adversarial network that uses KBOF and multiscale feature extraction to implicitly leverage the cues and priors about the structure of human bodies for global structural inference. The adversarial network enables the framework to combine information about the connections between local body joints at the microscopic level with the structural priors of the human body at the global level, thus enhancing its performance. The effectiveness and robustness of the network are evaluated on human pose prediction using two widely used benchmark datasets, MPII and COCO. Our approach outperforms state-of-the-art methods, especially in complex scenes. Our method achieves improvements of 2.6% and 1.7% over the latest method on the MPII test set and the COCO validation set, respectively.
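
The abstract does not give the formulation of the Keypoint Biorientation Field, so the following is only an illustrative sketch of what a bidirectional orientation field between two joints might look like. It assumes, hypothetically, that each limb is encoded as a pair of unit-vector maps, one pointing from joint A to joint B and one pointing back, defined on pixels near the segment that connects the joints. The function name `biorientation_field` and the `limb_width` parameter are assumptions introduced for illustration and do not come from the paper.

```python
import math


def biorientation_field(joint_a, joint_b, height, width, limb_width=1.0):
    """Rasterize forward (a->b) and backward (b->a) unit-vector maps for one limb.

    Hypothetical encoding: pixels within `limb_width` of the segment between
    the two joints store the limb direction; all other pixels store (0, 0).
    Joints are (x, y) pixel coordinates; maps are indexed as field[y][x].
    """
    ax, ay = joint_a
    bx, by = joint_b
    length = math.hypot(bx - ax, by - ay)
    zero = (0.0, 0.0)
    forward = [[zero] * width for _ in range(height)]
    backward = [[zero] * width for _ in range(height)]
    if length == 0:
        # Coincident joints define no direction, so both maps stay empty.
        return forward, backward

    # Unit vector pointing from joint A to joint B.
    ux, uy = (bx - ax) / length, (by - ay) / length
    for y in range(height):
        for x in range(width):
            dx, dy = x - ax, y - ay
            along = dx * ux + dy * uy          # position along the limb
            across = abs(dx * uy - dy * ux)    # distance from the limb axis
            if 0.0 <= along <= length and across <= limb_width:
                forward[y][x] = (ux, uy)
                backward[y][x] = (-ux, -uy)
    return forward, backward
```

Under this assumed encoding, a network would regress both maps alongside the joint heatmaps, and the two opposite directions give it a consistency cue for deciding which detected joints belong to the same person.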

Enhancement and Optimisation of Human Pose Estimation with Multi-Scale Spatial Attention and Adversarial Data Augmentation

Human Pose Estimation Based on Feature Enhancement and Multi-Scale Feature Fusion

Multi-Scale Structure-Aware Network for Human Pose Estimation

MSPENet: Multi-Scale Adaptive Fusion and Position Enhancement Network for Human Pose Estimation

STN-enhanced Message Passing Guided by Adversarial Learning for Human Pose Estimation

Joint Data Augmentation and Attention Mechanism for Occluded Human Pose Estimation

Multi-Person Pose Estimation in the Wild: Using Adversarial Method to Train a Top-Down Pose Estimation Network

Human Pose Estimation Based on Lightweight Multi-Scale Coordinate Attention

Improving Human Pose Estimation Based on Stacked Hourglass Network

Improved Multi-Person 2D Human Pose Estimation Using Attention Mechanisms and Hard Example Mining

Full-Resolution Encoder-Decoder Networks with Multi-Scale Feature Fusion for Human Pose Estimation

Research on Multi-level Attention-based Human Pose Estimation

A Structure-Aware Adversarial Framework with the Keypoint Biorientation Field for Multiperson Pose Estimation

CFENet: Content-aware Feature Enhancement Network for Multi-Person Pose Estimation

Multi-Scale Adaptive Structure Network For Human Pose Estimation From Color Images

Complementary Feature Pyramid Network for Human Pose Estimation

Optimized S2E Attention Block based Convolutional Network for Human Pose Estimation

Combining detailed appearance and multi-scale representation: a structure-context complementary network for human pose estimation

Improved Modular Convolution Neural Network for Human Pose Estimation

A Multi-Level Network for Human Pose Estimation

Human Pose Estimation Based on Parallel Atrous Convolution and Body Structure Constraints