Gradient multi-foci networks for 3D skeleton-based human motion prediction

Junyu Shi, Jianqi Zhong, Zhiquan He, Wenming Cao
DOI: https://doi.org/10.1007/s00521-024-09817-5
2024-05-11
Neural Computing and Applications
Abstract: Attention-based methods have shown encouraging performance in 3D skeleton-based human motion prediction, owing to the observation that the attended parts of past states influence future actions. By introducing an attention mechanism, a model can identify the information most relevant to motion prediction. However, existing methods tend to address only the general allocation of attention and do not pinpoint the exact location of the most relevant information, which limits prediction performance. To solve this problem, we propose a novel Gradient Multi-Foci Network (GMFnet), which leverages two-stage foci, a spectral focus and a spatial focus, to find the most pertinent information in a manner that emulates natural cognitive processes. The core idea of GMFnet rests on two aspects: a spectral focus, which models the repeatability of the observed action sequence by deploying an attention-based Related Sequences Directing Block (RSDB), and a spatial focus, which captures the most valuable parts among motion joints by using an Attention Feature Computational Unit (AFCU). Extensive experiments show that GMFnet can pinpoint the exact attention location and thereby improve prediction performance. GMFnet outperforms state-of-the-art methods in MPJPE by 10.7% and 7.4% for short-term and long-term prediction on Human 3.6M, and by 14.3% and 4.8% for short-term and long-term prediction on CMU-Mocap. On AMASS, the short-term improvement is larger still, at 24.9%. The code is available at https://github.com/JunyuShi02/GMFNet.
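The results above are reported in MPJPE (mean per-joint position error), which is the average Euclidean distance between predicted and ground-truth 3D joint positions. The sketch below is a minimal illustration of that metric, not the authors' evaluation code. The function names `mpjpe` and `relative_improvement` and the `[frames][joints][3]` input layout are assumptions made for this example, and it also assumes that the reported percentages are relative reductions in error, which the abstract does not state explicitly.

```python
import math


def mpjpe(pred, target):
    """Mean per-joint position error.

    pred, target: nested sequences shaped [frames][joints][3]
    (hypothetical layout chosen for this illustration).
    Returns the average Euclidean distance over all frames and joints.
    """
    total, count = 0.0, 0
    for pred_frame, target_frame in zip(pred, target):
        for p, t in zip(pred_frame, target_frame):
            total += math.dist(p, t)  # Euclidean distance between two 3D points
            count += 1
    if count == 0:
        raise ValueError("empty sequence")
    return total / count


def relative_improvement(baseline_error, new_error):
    """Percentage reduction in error relative to a baseline (assumed definition)."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# One frame, two joints: the first joint is off by 5 units, the second is exact.
pred = [[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]]
target = [[[3.0, 4.0, 0.0], [1.0, 0.0, 0.0]]]
print(mpjpe(pred, target))  # 2.5
```

Under this reading, a method that lowers a baseline MPJPE from 10.0 to 8.0 would be reported as a 20% improvement.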
computer science, artificial intelligence