Abstract: Graph convolutional network (GCN)-based methods for 3D human pose estimation usually aggregate only the immediate features of single-hop nodes. They are therefore unaware of correlations among multi-hop nodes and neglect the long-range dependencies needed to predict complex poses. In addition, they typically operate on either single-scale or sequentially down-sampled multi-scale graph representations, which loses contextual information or spatial detail. To address these problems, this paper proposes a parallel hop-aware graph attention network (PHGANet) for 3D human pose estimation, which learns enriched hop-aware correlations among the skeleton joints while maintaining spatially precise representations of the human graph. Specifically, we propose a hop-aware skeletal graph attention (HSGAT) module to capture the semantic correlation of multi-hop nodes; it first calculates skeleton-based 1-hop attention and then disseminates it to arbitrary hops via graph connectivity. To alleviate the redundant noise introduced by interactions with distant nodes, HSGAT uses an attenuation strategy that separates the attention from distinct hops and adaptively assigns each hop a learnable attenuation weight according to its distance. On top of HSGAT, we build PHGANet with multiple parallel branches of stacked HSGAT modules to learn the enriched hop-aware correlation of human skeletal structures at different scales. In addition, a joint centrality encoding scheme is proposed to introduce node importance as a bias in the learned graph representation, which makes the core joints (e.g., neck and pelvis) more influential during node aggregation. Experimental results indicate that PHGANet performs favorably against state-of-the-art methods on the Human3.6M and MPI-INF-3DHP benchmarks. Models and code are available at https://github.com/ChenyangWang95/PHGANet/.
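
The idea of computing 1-hop attention on the skeleton and then spreading it to farther hops with distance-dependent weights can be illustrated with a small NumPy sketch. This is not the authors' implementation (the linked repository is the reference); it assumes plain dot-product attention, self-loops on every joint, matrix powers of the 1-hop attention as the dissemination step, and fixed `hop_logits` standing in for the learnable attenuation weights. The function names and the toy 5-joint chain are hypothetical. Because of the self-loops, the k-th power here covers joints up to k hops away rather than strictly separating hops as the abstract describes.

```python
import numpy as np


def one_hop_attention(x, adj):
    """Row-softmax of dot-product scores, restricted to skeleton edges and self-loops."""
    n = adj.shape[0]
    mask = (adj + np.eye(n)) > 0                      # neighbours plus the joint itself
    scores = x @ x.T / np.sqrt(x.shape[1])            # assumed scoring function
    scores = np.where(mask, scores, -np.inf)          # non-edges get zero attention
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def hop_aware_attention(x, adj, hop_logits):
    """Mix powers of the 1-hop attention using softmax-normalised per-hop weights."""
    a1 = one_hop_attention(x, adj)
    theta = np.exp(hop_logits - np.max(hop_logits))
    theta = theta / theta.sum()                       # attenuation weights, one per hop
    a_k = np.eye(adj.shape[0])
    mixed = np.zeros_like(a1)
    for t in theta:
        a_k = a_k @ a1                                # attention reaching one hop farther
        mixed += t * a_k
    return mixed


# Toy skeleton: a chain of 5 joints (0-1-2-3-4) with 8-dimensional features.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))

# Decreasing logits mean farther hops contribute less (illustrative values).
att = hop_aware_attention(x, adj, hop_logits=np.array([0.0, -1.0, -2.0]))
out = att @ x                                         # hop-aware feature aggregation
```

With three hop weights, joint 0 attends to joints up to three edges away, while joint 4 (four edges away) receives exactly zero attention. Each row of `att` still sums to one, since it is a convex combination of row-stochastic matrices.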

Three-dimensional human pose estimation based on improved semantic graph convolution neural networks

Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution

3D Human Pose Estimation Using Improved Semantic Graph Convolutional Based on Fusing Non-local Neural Network and Multi-Head Attention

Semantic Graph Convolutional Networks for 3D Human Pose Regression

MSMB-GCN: Multi-scale Multi-branch Fusion Graph Convolutional Networks for 3D Human Pose Estimation

3D Hand Pose Estimation Using Semantic Dynamic Hypergraph Convolutional Networks

A residual semantic graph convolutional network with high-resolution representation for 3D human pose estimation in a virtual fashion show

SPGformer: Serial–Parallel Hybrid GCN-Transformer With Graph-Oriented Encoder for 2-D-to-3-D Human Pose Estimation

3D Human Pose Estimation Via Graph Extended Spatio-Temporal Convolutional Network

Semi-Dynamic Hypergraph Neural Network for 3D Pose Estimation

Relation-balanced graph convolutional network for 3D human pose estimation

HSGNet: hierarchically stacked graph network with attention mechanism for 3D human pose estimation

GLA-GCN: Global-local Adaptive Graph Convolutional Network for 3D Human Pose Estimation from Monocular Video

Correction To: Learning Enriched Hop-Aware Correlation for Robust 3D Human Pose Estimation

3D Human Pose Estimation Via Human Structure-Aware Fully Connected Network

Simplified-attention Enhanced Graph Convolutional Network for 3D human pose estimation

HPGCN: Hierarchical Poselet-Guided Graph Convolutional Network for 3D Pose Estimation

Graph and Skipped Transformer: Exploiting Spatial and Temporal Modeling Capacities for Efficient 3D Human Pose Estimation

Optimizing Network Structure for 3D Human Pose Estimation

High-order local connection network for 3D human pose estimation based on GCN