Skip-attention Encoder–decoder Framework for Human Motion Prediction

Zhang Ruipeng, Shu Xiangbo, Yan Rui, Zhang Jiachao, Song Yan
DOI: https://doi.org/10.1007/s00530-021-00807-4
IF: 3.9
2021-01-01
Multimedia Systems
Abstract: Human motion prediction aims to automatically predict a future motion sequence from an observed human motion sequence. In this paper, we propose a novel skip-attention encoder–decoder (SAED) framework to model human motion dependencies in spatiotemporal space, using the encoder to encode the observed motions and the decoder to decode the predicted motions. The framework rests on two main ideas. First, we design a new self-renewing ConvGRU as the unit of both encoder and decoder to effectively capture temporal and spatial skeleton-motion dependencies. Second, we present a new skip-attention mechanism (SAM) that aggregates the motion information of all layers according to their importance. Quantitative and qualitative results on the Human3.6M and CMU motion capture datasets show the effectiveness of the proposed SAED compared with related methods.
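The abstract does not give the SAM formulation, so the following is only a minimal sketch of the general idea of aggregating per-layer features by importance. It assumes each layer yields a fixed-length feature vector and that importance is scored by a dot product with a learned query vector; the names `aggregate_layers` and `query` are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_layers(layer_features, query):
    """Fuse per-layer feature vectors with importance weights.

    Assumption (not from the paper): importance is a dot-product score
    between each layer's features and a hypothetical learned `query`,
    normalized with softmax across layers.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in layer_features]
    weights = softmax(scores)
    dim = len(layer_features[0])
    # Weighted sum across layers, computed per feature dimension.
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, layer_features))
        for i in range(dim)
    ]
    return fused, weights


# Three toy "layers" with 2-dimensional features. A zero query gives every
# layer the same score, so the weights come out uniform.
layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = aggregate_layers(layers, query=[0.0, 0.0])
```

In a trained model the query (or whatever scoring function is used) would be learned, so layers carrying more useful motion information would receive larger weights than the uniform ones in this toy call.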