Hyper-STTN: Social Group-aware Spatial-Temporal Transformer Network for Human Trajectory Prediction with Hypergraph Reasoning

Weizheng Wang, Chaowei Wang, Baijian Yang, Guohua Chen, Byung-Cheol Min

2024-09-18

Abstract: Predicting crowd intents and trajectories is crucial in various real-world applications, including service robots and autonomous vehicles. Understanding environmental dynamics is challenging, not only because of the complexity of modeling pair-wise spatial and temporal interactions but also because of the diverse influence of group-wise interactions. To decode the comprehensive pair-wise and group-wise interactions in crowded scenarios, we introduce Hyper-STTN, a Hypergraph-based Spatial-Temporal Transformer Network for crowd trajectory prediction. In Hyper-STTN, group-wise correlations in the crowd are constructed using a set of multi-scale hypergraphs with varying group sizes, captured through random-walk probability-based hypergraph spectral convolution. Additionally, a spatial-temporal transformer is adapted to capture pedestrians' pair-wise latent interactions in spatial-temporal dimensions. These heterogeneous group-wise and pair-wise features are then fused and aligned through a multimodal transformer network. Hyper-STTN outperforms other state-of-the-art baselines and ablation models on 5 real-world pedestrian motion datasets.
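The group-wise branch described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the nearest-neighbour grouping rule, the function names (`knn_hyperedges`, `random_walk_matrix`, `hypergraph_conv`), the toy positions, and the random weights are all assumptions made for illustration. The sketch builds one hypergraph per group size and propagates features with the standard hypergraph random-walk transition matrix P = D_v^{-1} H W D_e^{-1} H^T, where H is the incidence matrix, W the hyperedge weights, and D_v, D_e the vertex and hyperedge degree matrices.

```python
import numpy as np

def knn_hyperedges(positions, group_size):
    """Incidence matrix H (pedestrians x hyperedges).

    Assumed grouping rule: hyperedge e contains pedestrian e and its
    (group_size - 1) nearest neighbours by Euclidean distance.
    """
    n = positions.shape[0]
    dists = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    H = np.zeros((n, n))
    for e in range(n):
        members = np.argsort(dists[e])[:group_size]
        H[members, e] = 1.0
    return H

def random_walk_matrix(H, edge_weights=None):
    """Transition matrix P = D_v^{-1} H W D_e^{-1} H^T of a hypergraph random walk."""
    n_edges = H.shape[1]
    w = np.ones(n_edges) if edge_weights is None else np.asarray(edge_weights, dtype=float)
    d_v = H @ w          # weighted vertex degrees
    d_e = H.sum(axis=0)  # number of pedestrians in each hyperedge
    return (H * (w / d_e)) @ H.T / d_v[:, None]

def hypergraph_conv(X, P, theta):
    """One propagation step: mix features along P, project with theta, apply ReLU."""
    return np.maximum(P @ X @ theta, 0.0)

# Toy scene: a cluster of three pedestrians and a separate pair.
positions = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [5.0, 5.0], [5.5, 5.0]])
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 4))  # hypothetical per-pedestrian embeddings
theta = rng.normal(size=(4, 3))     # hypothetical projection weights (learned in practice)

# Multi-scale: one hypergraph per group size, outputs concatenated per pedestrian.
scales = []
for group_size in (2, 3):
    H = knn_hyperedges(positions, group_size)
    P = random_walk_matrix(H)
    scales.append(hypergraph_conv(features, P, theta))
multi_scale = np.concatenate(scales, axis=1)
print(multi_scale.shape)  # (5, 6)
```

Each row of `P` sums to 1, so a propagation step averages a pedestrian's features with those of the pedestrians it shares hyperedges with. In a trained model the projection would be learned and the grouping rule would likely differ from plain nearest neighbours.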

Computer Vision and Pattern Recognition, Machine Learning

What problem does this paper attempt to address?

The paper addresses the problem of modeling complex social interactions in crowd trajectory prediction, in particular predicting the intentions and trajectories of pedestrians in crowded social environments. Specifically, it targets the following key issues:

1. **Understanding Environmental Dynamics**: In crowded scenarios, it is essential to model not only the spatial and temporal interactions between individuals but also the complex influences between groups. Both are needed to understand and predict human behavior.
2. **High-Order Interaction Description**: Existing methods often lack an effective description of high-order interactions (such as interactions between groups) and the ability to reason over heterogeneous features.
3. **Subjective Intention Prediction**: Accurately predicting an individual's subjective intentions from limited information remains challenging, especially in highly dynamic or complex scenarios.

To address these issues, the authors propose a framework called Hyper-STTN, which combines multi-scale hypergraphs with a spatial-temporal transformer network to capture both pair-wise interactions between individuals and group-wise interactions. A multimodal transformer network then fuses these heterogeneous features. Experimental results show that Hyper-STTN outperforms existing state-of-the-art algorithms on five real-world pedestrian motion datasets.
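The fusion step can be pictured as cross-attention between the two feature streams. The sketch below is a generic single-head scaled dot-product cross-attention in NumPy, assumed here for illustration only; this summary does not specify the paper's multimodal transformer, and the names `cross_attention`, `pair_feats`, `group_feats`, and all weights are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Queries come from one stream; keys and values come from the other."""
    q = query_feats @ w_q
    k = context_feats @ w_k
    v = context_feats @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
n_peds, d_in, d_out = 5, 8, 4
pair_feats = rng.normal(size=(n_peds, d_in))   # stand-in for pair-wise transformer output
group_feats = rng.normal(size=(n_peds, d_in))  # stand-in for hypergraph branch output
w_q, w_k, w_v = (rng.normal(size=(d_in, d_out)) for _ in range(3))

# Each stream attends to the other; the two results are concatenated per pedestrian.
pair_to_group, attn_pg = cross_attention(pair_feats, group_feats, w_q, w_k, w_v)
group_to_pair, attn_gp = cross_attention(group_feats, pair_feats, w_q, w_k, w_v)
fused = np.concatenate([pair_to_group, group_to_pair], axis=1)
print(fused.shape)  # (5, 8)
```

Sharing one set of projection weights across both directions keeps the example short; a real model would typically learn separate projections per direction and stack several multi-head layers.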

Dynamic-learning Spatial-Temporal Transformer Network for Vehicular Trajectory Prediction at Urban Intersections

A Spatio-Temporal Transformer Network for Human Motion Prediction in Human-Robot Collaboration

Attention-aware Social Graph Transformer Networks for Stochastic Trajectory Prediction

Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction

PTP-STGCN: Pedestrian Trajectory Prediction Based on a Spatio-temporal Graph Convolutional Neural Network

Knowledge-aware Graph Transformer for Pedestrian Trajectory Prediction

Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction

Learning Sparse Interaction Graphs of Partially Detected Pedestrians for Trajectory Prediction

D-STGCN: Dynamic Pedestrian Trajectory Prediction Using Spatio-Temporal Graph Convolutional Networks

STIGCN: spatial–temporal interaction-aware graph convolution network for pedestrian trajectory prediction

Multimodal Pedestrian Trajectory Prediction Based on Relative Interactive Spatial-Temporal Graph

Pedestrian Trajectory Prediction via Spatial Interaction Transformer Network

Spatio-Temporal Context Graph Transformer Design for Map-Free Multi-Agent Trajectory Prediction

Dual-branch Spatio-Temporal Graph Neural Networks for Pedestrian Trajectory Prediction

STM-GCN: a spatiotemporal multi-graph convolutional network for pedestrian trajectory prediction

Spatio-Temporal Graph Dual-Attention Network for Multi-Agent Prediction and Tracking

Spatio-Temporal Interaction Aware and Trajectory Distribution Aware Graph Convolution Network for Pedestrian Multimodal Trajectory Prediction

Adaptive and Simultaneous Trajectory Prediction for Heterogeneous Agents Via Transferable Hierarchical Transformer Network

S2TNet: Spatio-Temporal Transformer Networks for Trajectory Prediction in Autonomous Driving

Multi-Modal Pedestrian Trajectory Prediction for Edge Agents Based on Spatial-Temporal Graph