3D Human Pose Estimation with Multi-Scale Graph Convolution and Hierarchical Body Pooling

Huang Ke, Sui TianQi, Wu Hong
DOI: https://doi.org/10.1007/s00530-021-00808-3
IF: 3.9
2021-01-01
Multimedia Systems
Abstract: Since human pose can be naturally represented as a graph, graph convolutional networks (GCNs) have recently been proposed for 3D human pose estimation and have achieved promising results. However, most GCN-based methods use vanilla graph convolution, which aggregates features only from 1-hop neighbors, so long-range dependencies between joints can be captured only by stacking multiple graph convolution layers. To alleviate this problem, we propose a multi-scale graph convolution that aggregates features of neighbors at different distances, and we apply it to nodes with specified neighbor types. We further propose hierarchical body pooling to aggregate and share body-level and body-part-level context information. Based on these components, we develop a lightweight GCN for 3D pose lifting by repeatedly stacking a residual block of multi-scale graph convolution and a hierarchical body pooling layer. Experimental results on the Human3.6M dataset indicate that our network achieves state-of-the-art performance with much lower model complexity.
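The abstract does not give the exact formulation, so the following is only a minimal NumPy sketch of the two ideas as they are commonly realized, not the authors' implementation. It assumes that "neighbors at different distances" means joints at shortest-path distance exactly k, that each distance gets its own weight matrix, and that pooling is a plain mean over body parts and over the whole body. The function names, the 5-joint chain skeleton, and the part grouping are all hypothetical.

```python
import numpy as np

def k_hop_adjacency(adj, max_hops):
    """Return a list whose entry k-1 marks joint pairs at shortest-path distance exactly k."""
    n = adj.shape[0]
    reached = np.eye(n, dtype=bool)   # pairs already reached at a smaller distance
    frontier = np.eye(n, dtype=bool)  # pairs at distance exactly k-1
    hops = []
    for _ in range(max_hops):
        # Step one edge outward from the frontier, then drop pairs seen earlier.
        nxt = (frontier.astype(int) @ adj.astype(int)) > 0
        nxt &= ~reached
        hops.append(nxt.astype(float))
        reached |= nxt
        frontier = nxt
    return hops

def multi_scale_graph_conv(x, hops, weights):
    """Assumed form: x @ W_0 + sum_k rownorm(A_k) @ x @ W_k, one weight matrix per scale."""
    out = x @ weights[0]  # self term
    for a_k, w_k in zip(hops, weights[1:]):
        deg = a_k.sum(axis=1, keepdims=True)
        a_norm = a_k / np.maximum(deg, 1.0)  # mean over k-hop neighbors, safe when there are none
        out = out + a_norm @ x @ w_k
    return out

def hierarchical_body_pooling(x, parts):
    """Assumed form: append part-level and body-level mean features to every joint."""
    body_ctx = np.repeat(x.mean(axis=0, keepdims=True), x.shape[0], axis=0)
    part_ctx = np.zeros_like(x)
    for idx in parts:
        part_ctx[idx] = x[idx].mean(axis=0)
    return np.concatenate([x, part_ctx, body_ctx], axis=1)

# Hypothetical toy skeleton: a chain of 5 joints, 0-1-2-3-4.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))                               # 2D joint coordinates as input features
hops = k_hop_adjacency(adj, max_hops=3)
weights = [rng.normal(size=(2, 4)) for _ in range(4)]     # W_0 plus one matrix per hop distance
y = multi_scale_graph_conv(x, hops, weights)              # shape (5, 4)
z = hierarchical_body_pooling(y, parts=[[0, 1], [2, 3, 4]])  # shape (5, 12)
```

With three scales, a single layer already mixes information between joints up to three edges apart, which a vanilla 1-hop layer would need three stacked layers to do; this is the motivation stated in the abstract.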