Abstract: Multi-person pose estimation has become an increasingly popular topic as computer vision and human-machine interaction tasks have advanced, and progress in this field can further improve the understanding of human poses and activities. Current mainstream multi-person pose estimation methods fall into two categories: top-down and bottom-up. Top-down methods achieve better accuracy by reducing the problem to single-person pose estimation, but this strategy substantially increases time complexity. Bottom-up methods directly locate all keypoints in the image, which is potentially more efficient and can run in real time. However, most current bottom-up methods separate keypoint detection and keypoint grouping into two independent steps, which hinders both the overall performance and the computational efficiency of the algorithms. To address this issue, we propose an end-to-end bottom-up framework for multi-person pose estimation. Using HRNet as the backbone, we add a deconvolution module to obtain high-resolution feature maps in the keypoint proposal stage. A graph neural network is used in the grouping stage and is integrated into the backbone so that the whole framework can be trained end to end. With the keypoint candidates as nodes, two discriminators supervise the grouping process. Lastly, a graph-based pose optimization algorithm refines the results. Experiments on the COCO and CrowdPose datasets show that our method achieves better accuracy and greatly reduces computation time.

An Improved Human Pose Estimation Model Based on DEKR

Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression

DEKRV2: More Accurate or Fast Than DEKR

X-HRNet: Towards Lightweight Human Pose Estimation with Spatially Unidimensional Self-Attention

A Deconvolutional Bottom-up Deep Network for Multi-Person Pose Estimation

DecenterNet: Bottom-Up Human Pose Estimation Via Decentralized Pose Representation

Densely Connected Attentional Pyramid Residual Network for Human Pose Estimation

Human pose estimation in crowded scenes using Keypoint Likelihood Variance Reduction

Multi-person Pose Estimation Based on Graph Grouping Optimization

Towards Accurate Human Pose Estimation in Videos of Crowded Scenes

Human Pose Estimation Model Based on DiracNets and Integral Pose Regression

LiteDEKR: End‐to‐end Lite 2D Human Pose Estimation Network

Bottom-Up Human Pose Estimation Based on Multiple Anchor Points Regression

Multi-Stage HRNet: Multiple Stage High-Resolution Network for Human Pose Estimation

Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates

DGCN: Dynamic Graph Convolutional Network for Efficient Multi-Person Pose Estimation

Towards High Performance One-Stage Human Pose Estimation

DHRNet: A Dual-Path Hierarchical Relation Network for Multi-Person Pose Estimation

JCN: Joint Constraint-Based Human Pose Refinement Networks

Hierarchical Associative Encoding and Decoding for Bottom-Up Human Pose Estimation

HPRNet: Hierarchical Point Regression for Whole-Body Human Pose Estimation