Efficient Human Motion Prediction Using Temporal Convolutional Generative Adversarial Network

Qiongjie Cui, Huaijiang Sun, Yue Kong, Xiaoqian Zhang, Yanmeng Li
DOI: https://doi.org/10.1016/j.ins.2020.08.123
IF: 8.1
2020-01-01
Information Sciences
Abstract: Human motion prediction from historical poses is an essential task in computer vision; it has been successfully applied to human-machine interaction and intelligent driving. Recently, significant progress has been made with variants of RNNs or LSTMs. Although these variants alleviate the vanishing gradient problem, the chain RNN often leads to deformities and convergence to the mean pose because of its low ability to capture long-term dependencies. To address these problems, in this paper, we propose a temporal convolutional generative adversarial network (TCGAN) to forecast high-fidelity future poses. The TCGAN uses hierarchical temporal convolution to model the long-term patterns of human motion effectively. In contrast to RNNs, the hierarchical convolution structure has recently proved to be a more efficient method for sequence-to-sequence learning in terms of computational complexity, the number of model parameters, and parallelism. In addition, unlike traditional GANs, the model embeds spectral normalization (SN) to alleviate mode collapse. Compared with typical recurrent methods, the proposed model is feedforward and can produce future poses in real time. Extensive experiments on various human activity analysis benchmarks (i.e., H3.6M, CMU, and 3DPW MoCap) demonstrate that the model consistently outperforms the state-of-the-art methods in terms of accuracy and visualization for short-term and long-term predictions. © 2020 Elsevier Inc. All rights reserved.