Three-dimensional human pose estimation based on improved semantic graph convolution neural networks

Chengkun Yang, Min Guo, Miao Ma
DOI: https://doi.org/10.1117/1.JEI.31.6.063044
IF: 0.829
2022-01-01
Journal of Electronic Imaging
Abstract: Graph convolutional neural networks (GCNs) have been widely applied to many real-world problems and have achieved good results. To improve the accuracy of GCNs in the three-dimensional human pose estimation (3D HPE) task, improved semantic GCNs (SimpreRxSkip-NonCam-SemGCN and SPRA-SemGCN) are proposed. To address the insufficient use of effective information in 3D HPE tasks, a dual-layer attention module for information extraction (NonCam-attention) is proposed. It establishes a spatial and channel-based fusion structure by capturing the distance dependencies between different joint nodes of human poses and the associated features between different mapping channels. To reduce the training time and GPU memory of the preaggregated semantic graph convolutional network (Pre-SemGCN) model, a simplified Pre-SemGCN module (Simpre-SemGCN) is proposed, which accelerates training by eliminating the nonlinear activation layer. Further, a nested recursive expansion-based residual connection module (RxSkip + LN) is proposed to adjust the ratio of input and output with fewer parameters. To verify the effectiveness of the proposed model, we conduct experiments on the Human 3.6M dataset. The results show that the proposed SPRA-SemGCN model effectively reduces the error of the 3D HPE task, with the mean per joint position error reduced to 37.20 mm. © 2022 SPIE and IS&T
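The mean per joint position error reported in the abstract is commonly computed as the average Euclidean distance between predicted and ground-truth joint positions. The following is a minimal sketch of that standard definition, not the authors' evaluation code: the function name `mpjpe` and the toy coordinates are illustrative assumptions, and any alignment step used in the paper's protocol (which the abstract does not specify) is omitted.

```python
import math


def mpjpe(predicted, target):
    """Average Euclidean distance between matching joints.

    Both arguments are sequences of (x, y, z) tuples in the same unit
    (for example millimetres); the result is in that unit.
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty with the same number of joints")
    total = 0.0
    for p, t in zip(predicted, target):
        total += math.dist(p, t)  # per-joint Euclidean distance
    return total / len(predicted)


# Toy two-joint pose with hypothetical coordinates in mm (not data from the paper).
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error = mpjpe(pred, gt)
```

In this toy case the first joint matches exactly and the second is 5 mm away, so the average over the two joints is 2.5 mm.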