A Non-autoregressive Decoding Model Based on Joint Classification for 3D Human Pose Regression.

Yuhang Guo, Dongmei Fu, Tao Yang
DOI: https://doi.org/10.1007/978-3-030-88007-1_36
2021-01-01
Abstract: Graph convolutional networks (GCNs) have recently been widely applied to 3D human pose regression and have shown encouraging performance. One limitation of this approach is that it models only the semantic correlation between 2D joint features and ignores the variability of that correlation. To address this limitation, we propose a non-autoregressive decoding model based on joint classification (JC-NARD), which performs 3D joint regression using sequence analysis. The model splits the joints into several joint sub-sequences according to their connections and semantic correlation, and then, within each sub-sequence, uses an attention mechanism to model the correlation between 3D-3D joint features and between 2D-3D joint features, establishing 3D spatial constraints between joints. To verify the accuracy and generalization of the model, we combine it with several 3D human pose regression networks; the performance of every combined model improves by 1.2-4.5 mm.
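The idea of restricting attention to joint sub-sequences can be illustrated with a minimal sketch. This is not the JC-NARD implementation: it assumes plain scaled dot-product self-attention over per-joint feature vectors, uses no learned projections, and does not distinguish the 3D-3D and 2D-3D correlations named in the abstract. The function names and the joint grouping (`EXAMPLE_GROUPS`) are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical grouping of six joints into two sub-sequences (not from the paper).
EXAMPLE_GROUPS = {"left_arm": [0, 1, 2], "right_arm": [3, 4, 5]}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_within_group(features):
    """Scaled dot-product self-attention over one sub-sequence of joint features.

    Each joint's output is a weighted average of the features of the joints
    in the same sub-sequence; queries, keys and values are the raw features.
    """
    dim = len(features[0])
    outputs = []
    for query in features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in features
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * value[i] for w, value in zip(weights, features)) for i in range(dim)]
        )
    return outputs


def grouped_attention(joint_features, groups):
    """Apply attention separately inside each joint sub-sequence.

    Joints in different groups never attend to each other. Joints not listed
    in any group are left as None.
    """
    result = [None] * len(joint_features)
    for indices in groups.values():
        attended = attend_within_group([joint_features[i] for i in indices])
        for index, vector in zip(indices, attended):
            result[index] = vector
    return result


if __name__ == "__main__":
    # Toy 2-dimensional features for six joints.
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0], [2.0, 2.0], [2.0, 2.0]]
    for joint, vector in enumerate(grouped_attention(feats, EXAMPLE_GROUPS)):
        print(joint, vector)
```

Because attention is computed per group, changing a feature in one sub-sequence leaves the outputs of the other sub-sequences unchanged. That is the structural property this sketch is meant to show.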