A multilayer human motion prediction perceptron by aggregating repetitive motion
Lei Geng, Wenzhu Yang, Yanyan Jiao, Shuang Zeng, Xinting Chen
DOI: https://doi.org/10.1007/s00138-023-01447-6
IF: 2.983
2023-09-15
Machine Vision and Applications
Abstract: Human motion prediction aims to forecast future human poses given a historical motion sequence. Current state-of-the-art approaches rely on deep learning architectures of arbitrary complexity, such as Recurrent Neural Networks (RNNs) and Graph Convolutional Networks (GCNs), and typically require multiple training stages and more parameters. In addition, existing learning-based methods fail to model the observation that human motion tends to repeat itself. To address this neglect of the repetitive nature of human motion, we first introduce a Multi-level Attention Mechanism (MAM) that explicitly leverages this observation to find relevant historical information for predicting future motion. Instead of modeling frame-wise attention via pose similarity, motion attention is extracted to capture the similarity between the current motion context and the historical motion sub-sequences. In this context, we study different types of attention, computed at the joint, body part, and full pose levels. Furthermore, to reduce the complexity of existing algorithms based on deep learning architectures, we introduce a Fully connected Transpose MLP (FTMLP) model. By combining an MLP network with a fully connected and transposed layer to process the aggregated relevant past movements, motion patterns from the long-term history can be used quickly and efficiently to predict future poses. Experimental results on the standard motion prediction benchmarks Human3.6M and the CMU motion capture dataset show that our model makes accurate short- and long-term predictions.
computer science, cybernetics, artificial intelligence, engineering, electrical & electronic
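
The two ideas in the abstract, attention over historical motion sub-sequences and a fully connected layer applied along a transposed axis, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `motion_attention` and `transposed_mlp`, the scaled dot-product similarity, the ReLU activation, and all shapes are assumptions made for illustration, and only full-pose-level attention is shown (the joint and body-part levels are omitted).

```python
import numpy as np


def motion_attention(history, context_len, horizon):
    """Aggregate past sub-sequences that resemble the current motion context.

    history: array of shape (N, D), N past frames of D pose features.
    Returns (aggregated, weights), where aggregated has shape
    (context_len + horizon, D). Hypothetical helper, not from the paper.
    """
    n_frames = history.shape[0]
    span = context_len + horizon
    n_windows = n_frames - span + 1
    if n_windows < 1:
        raise ValueError("history is too short for the requested window sizes")

    # Query: the most recent context_len frames, flattened to one vector.
    query = history[-context_len:].ravel()
    # Keys: every earlier window of context_len frames.
    keys = np.stack(
        [history[i:i + context_len].ravel() for i in range(n_windows)]
    )
    # Values: each key window together with the horizon frames that followed it.
    values = np.stack([history[i:i + span] for i in range(n_windows)])

    # Assumed similarity: scaled dot product followed by a softmax.
    scores = keys @ query / np.sqrt(query.size)
    scores = scores - scores.max()  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()

    # Weighted sum over windows gives an array of shape (span, D).
    aggregated = np.tensordot(weights, values, axes=1)
    return aggregated, weights


def transposed_mlp(x, w_feat, w_time):
    """Mix features with one dense layer, then mix time on the transposed input.

    x: (L, D), w_feat: (D, D), w_time: (T, L). Returns (T, D).
    Hypothetical stand-in for the FTMLP idea, not the published architecture.
    """
    hidden = np.maximum(x @ w_feat, 0.0)  # feature mixing with ReLU
    return (hidden.T @ w_time.T).T        # temporal mixing across frames
```

For example, with 50 history frames of 66 pose features, a 10-frame context, and a 10-frame horizon, `motion_attention` returns a `(20, 66)` aggregate that `transposed_mlp` can map to a `(10, 66)` prediction, given randomly initialised weights of shapes `(66, 66)` and `(10, 20)`.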