Geometric algebra-based multiscale encoder-decoder networks for 3D motion prediction

Jianqi Zhong,Wenming Cao
DOI: https://doi.org/10.1007/s10489-023-04908-7
IF: 5.3
2023-09-02
Applied Intelligence
Abstract:3D human motion prediction is one of the essential and challenging problems in computer vision, which has attracted extensive research attention in the past decades. Many previous methods sought to predict the motion state of the next moment using the traditional recurrent neural network in Euclidean space. However, most methods did not explicitly exploit the relationships or constraints between different body components, which carry crucial information for motion prediction. In addition, human motion representation in Euclidean space has high distortion and shows a weak semantic expression when using deep learning models. Based on these observations, we propose a novel Geometric Algebra-based Multiscale Encoder-Decoder network (GAMEDnet) to predict the future 3D poses. In the encoder, the core module is a novel multiscale Geometric Algebra-based multiscale feature extractor(GA-MFE) , which extracts motion features given the multiscale human motion graph. In the decoder, we propose a novel GA-Graph-based Gated Recurrent Unit (GAG-GRU) to sequentially produce predictions. Extensive experiments are conducted to show that the proposed GAMEDnet outperforms state-of-the-art methods in both short and long-term motion prediction on the datasets of Human 3.6M, CMU Mocap.
computer science, artificial intelligence
What problem does this paper attempt to address?