3D Skeleton-Based Human Motion Prediction Using Spatial–Temporal Graph Convolutional Network

Jianying Huang, Hoon Kang
DOI: https://doi.org/10.1007/s13735-024-00341-9
2024-01-01
International Journal of Multimedia Information Retrieval
Abstract: 3D human motion prediction, that is, predicting future human poses on the basis of historically observed motion sequences, is a core task in computer vision. Thus far, it has been successfully applied to both autonomous driving and human–robot interaction. Previous work has usually employed Recurrent Neural Network (RNN)-based models to predict future human poses. However, as previous works have amply demonstrated, RNN-based prediction models produce unrealistic and discontinuous motion because prediction errors accumulate over time. To address this limitation, we propose a feed-forward, 3D skeleton-based model for human motion prediction: the Spatial–Temporal Graph Convolutional Network (ST-GCN) model, which automatically learns the spatial and temporal patterns of human motion from input sequences. Specifically, our ST-GCN model is based on an encoder-decoder architecture. The encoder consists of 5 ST-GCN modules, each comprising a spatial GCN layer and a 2D convolution-based TCN layer, which together encode the spatio-temporal dynamics of human motion. The decoder, consisting of 5 TCN layers, then uses the encoded spatio-temporal representation to predict future human poses. We performed extensive experiments with the ST-GCN model on several large-scale 3D human pose datasets (Human3.6M, AMASS, 3DPW), adopting MPJPE (Mean Per Joint Position Error) as the evaluation metric. The experimental results demonstrate that our ST-GCN model outperforms the baseline models in both short-term (< 400 ms) and long-term (> 400 ms) prediction.
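For readers unfamiliar with the evaluation metric, MPJPE is commonly computed as the Euclidean distance between each predicted joint and its ground-truth position, averaged over all joints and frames. The following is a minimal sketch under that common definition, not the authors' evaluation code; the function name `mpjpe` and the nested-list pose layout (frames, then joints, then x, y, z coordinates) are assumptions made for illustration.

```python
import math


def mpjpe(predicted, target):
    """Mean Per Joint Position Error for two pose sequences.

    Both arguments are lists of frames; each frame is a list of
    (x, y, z) joint positions. This layout is an assumption made
    for illustration. The result is in the same unit as the inputs.
    """
    if len(predicted) != len(target):
        raise ValueError("sequences must contain the same number of frames")

    total_distance = 0.0
    joint_count = 0
    for pred_frame, true_frame in zip(predicted, target):
        if len(pred_frame) != len(true_frame):
            raise ValueError("frames must contain the same number of joints")
        for pred_joint, true_joint in zip(pred_frame, true_frame):
            # Euclidean distance between predicted and true joint position
            total_distance += math.dist(pred_joint, true_joint)
            joint_count += 1

    if joint_count == 0:
        raise ValueError("no joints to evaluate")
    return total_distance / joint_count


# One frame with two joints: the first is off by 5 units, the second is exact.
predicted = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
target = [[(0.0, 3.0, 4.0), (1.0, 0.0, 0.0)]]
print(mpjpe(predicted, target))  # 2.5
```

A lower value means the predicted skeleton lies closer to the ground truth. Evaluating the short-term and long-term horizons separately would amount to applying the same function to the frames that fall within each time window.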