Adaptive Spatial-Temporal Graph-Mixer for Human Motion Prediction

Shubo Yang, Haolun Li, Chi-Man Pun, Chun Du, Hao Gao
DOI: https://doi.org/10.1109/lsp.2024.3392686
2024-05-07
IEEE Signal Processing Letters
Abstract: The Graph Convolutional Network (GCN) has recently achieved promising performance in human motion prediction by modeling the nodes and edges of the human skeleton. However, most previous methods still suffer from two unaddressed drawbacks. First, in the inference stage, their graph topologies are static and fixed, so the dependencies between nodes cannot be dynamically adjusted for different actions. Second, the implicit relationships between pose sequences are ignored, so the prior advantages of the graph structure are lost in temporal feature fusion. To address these limitations, we propose an adaptive spatial-temporal graph-mixer (GraphMixer) for human motion prediction, which consists of a series of fully separated spatial-temporal graph convolution structures. In the spatial GCN, we construct an additional adaptive skeleton graph to capture the node features of action-specific poses. In the temporal GCN, we introduce a variety of graph topologies to enhance feature fusion between pose sequences. Comparisons with state-of-the-art algorithms on the Human3.6M and 3DPW datasets, together with ablation studies, show that our GraphMixer and the proposed multiple graph topologies are effective and critical.
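The separated spatial and temporal graph convolutions described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the adaptive skeleton graph is computed or which temporal topologies are used, so the attention-style adaptive adjacency, the chain and fully connected temporal graphs, and all function and parameter names below are assumptions made only for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def spatial_graph_conv(x, skeleton_adj, w_feat, w_query, w_key):
    """Mix joints within each frame. x has shape (T, J, C).

    Assumption: the adaptive graph is an input-dependent, attention-style
    (J, J) matrix per frame, added to the fixed skeleton graph.
    """
    a_fixed = normalize_adjacency(skeleton_adj)               # (J, J)
    q = x @ w_query                                           # (T, J, D)
    k = x @ w_key                                             # (T, J, D)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])  # (T, J, J)
    a_adapt = softmax(scores, axis=-1)                        # rows sum to 1
    mixed = (a_fixed[None] + a_adapt) @ x                     # (T, J, C)
    return mixed @ w_feat                                     # (T, J, C_out)


def temporal_graph_conv(x, temporal_adjs, w_feat):
    """Mix frames for each joint. x has shape (T, J, C).

    Assumption: several (T, T) temporal topologies are normalized and summed.
    """
    a = sum(normalize_adjacency(adj) for adj in temporal_adjs)  # (T, T)
    mixed = np.einsum("st,tjc->sjc", a, x)                      # (T, J, C)
    return mixed @ w_feat                                       # (T, J, C_out)
```

As a usage illustration, a toy 5-joint chain skeleton over 4 frames could pass through `spatial_graph_conv` and then `temporal_graph_conv`, with `np.eye(T, k=1) + np.eye(T, k=-1)` (neighboring frames) and `np.ones((T, T)) - np.eye(T)` (all frame pairs) as two hypothetical temporal topologies. Because the adaptive adjacency is recomputed from the input, the joint dependencies change with the pose at inference time, which is the property the abstract contrasts with static graph topologies.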
engineering, electrical & electronic