KSOF: Leveraging kinematics and spatio-temporal optimal fusion for human motion prediction

Rui Ding,KeHua Qu,Jin Tang
DOI: https://doi.org/10.1016/j.patcog.2024.111206
IF: 8
2024-12-03
Pattern Recognition
Abstract:Ignoring the meaningful kinematics law, which generates improbable or impractical predictions, is one of the obstacles to human motion prediction. Current methods attempt to tackle this problem by taking simple kinematics information as auxiliary features to improve predictions. However, it remains challenging to utilize human prior knowledge deeply, such as the trajectory formed by the same joint should be smooth and continuous in this task. In this paper, we advocate explicitly describing kinematics information via velocity and acceleration by proposing a novel loss called joint point smoothness (JPS) loss, which calculates the acceleration of joints to smooth the sudden change in joint velocity. In addition, capturing spatio-temporal dependencies to make feature representations more informative is also one of the obstacles in this task. Therefore, we propose a dual-path network (KSOF) that models the temporal and spatial dependencies from kinematic temporal convolutional network (K-TCN) and spatial graph convolutional networks (S-GCN), respectively. Moreover, we propose a novel multi-scale fusion module named spatio-temporal optimal fusion (SOF) to enhance extraction of the essential correlation and important features at different scales from spatio-temporal coupling features. We evaluate our approach on three standard benchmark datasets, including Human3.6M, CMU-Mocap, and 3DPW datasets. For both short-term and long-term predictions, our method achieves outstanding performance on all these datasets. The code is available at https://github.com/qukehua/KSOF .
computer science, artificial intelligence,engineering, electrical & electronic
What problem does this paper attempt to address?