HKE-GCN: Heatmaps-guided Keypoints Encoder and Graph Convolutional Network for Human Pose Estimation

Han Xia,Yiran Wang,Xiaoru Wang,Songkai Xiong,Zhihong Yu
DOI: https://doi.org/10.1109/ijcnn55064.2022.9892251
2022-01-01
Abstract:Multi-person pose estimation is a challenging task which aims to locate keypoints for multiple persons. Graph convolutional network can effectively capture the semantic relationship among keypoints according to the kinematic structure of the human body, which is beneficial to locate keypoints but is the lack of ability of most CNN-based models. However, existing GCN-based methods mostly flatten the 2D features directly to obtain 1D embeddings, leading to the redundant information in keypoints embeddings, large size of keypoints embeddings, and high computation cost. To address these problems, we propose a two-stage framework based on Heatmaps-guided Keypoints Encoder and graph convolutional network, called HKE-GCN. The first stage uses a heatmaps-based network to predict the heatmaps of keypoints, then the second stage refines the prediction of the first stage. The second stage consists of two modules: Heatmaps-guided Keypoints Encoder (HKE) and Graph-based Refinement Module (GRM), which are used to generate keypoints embeddings according to the guidance of heatmaps and explicitly learn the relationship among keypoints based on GCN, respectively. Experiments show our framework is model-agnostic and our proposed modules are effective and lightweight. Our best model achieves state-of-the-art 76.4AP on COCO test-dev.
What problem does this paper attempt to address?