Abstract: Accurate prediction of future pedestrian trajectories could prevent a considerable number of traffic injuries and improve pedestrian safety. The task involves multiple sources of information and real-time interactions, e.g., vehicle speed and ego-motion, pedestrian intention and historical locations. Existing methods simply concatenate these cues, and their dynamics over time are less studied. In this paper, we propose a novel Long Short-Term Memory (LSTM) variant that adaptively incorporates multiple sources of information from pedestrians and vehicles. Unlike the standard LSTM, our model considers mutual interactions and explores intrinsic relations among multiple cues. First, we introduce extra memory cells to improve the transferability of LSTMs in modeling future variations. These extra memory cells include a speed cell to explicitly model vehicle speed dynamics, an intention cell to dynamically analyze pedestrian crossing intentions and a correlation cell to exploit correlations among temporal frames. Together, these three cells uncover the future movement of vehicles, pedestrians and global scenes. Second, we propose a gated shifting operation to learn the movement of pedestrians. Whether or not a pedestrian intends to cross the road significantly affects the pedestrian's spatial location. To this end, global scene dynamics and pedestrian intention information are leveraged to model the spatial shifts. Third, we integrate the speed variations into the output gate and dynamically reweight the output channels via the scaling of vehicle speed. The movement of the vehicle alters the scale of the predicted pedestrian bounding box: as the vehicle gets closer to the pedestrian, the bounding box grows. Our rescaling process captures this relative movement and updates the size of pedestrian bounding boxes accordingly.
Experiments conducted on three pedestrian trajectory forecasting benchmarks show that our model achieves state-of-the-art performance.
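
As a rough illustration of the third idea in the abstract (reweighting the output channels by a speed-derived scale), the following minimal NumPy sketch shows one LSTM step whose output gate is multiplied by a per-channel factor computed from a scalar speed change. The abstract gives no equations, so everything here is an assumption made for illustration: the function name `lstm_step_with_speed_scaling`, the parameters `w_s` and `b_s`, and the particular scaling form `2 * sigmoid(...)` are hypothetical and are not the authors' formulation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step_with_speed_scaling(x, h_prev, c_prev, speed_delta, params):
    """One standard LSTM step with a speed-dependent output rescaling.

    Hypothetical sketch: the scaling form below is an assumption,
    not the formulation used in the paper.
    """
    W, U, b = params["W"], params["U"], params["b"]  # (4H, D), (4H, H), (4H,)
    w_s, b_s = params["w_s"], params["b_s"]          # (H,), (H,)  assumed names
    H = h_prev.shape[0]

    # Standard LSTM gates computed from the input and previous hidden state.
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2 * H])      # forget gate
    o = sigmoid(z[2 * H:3 * H])  # output gate
    g = np.tanh(z[3 * H:4 * H])  # candidate cell update
    c = f * c_prev + i * g

    # Assumed rescaling: a per-channel factor in (0, 2) that equals 1
    # when the speed change and the bias b_s are both zero.
    scale = 2.0 * sigmoid(w_s * speed_delta + b_s)
    h = (o * scale) * np.tanh(c)
    return h, c
```

With `speed_delta = 0` and `b_s = 0` the scale is exactly 1, so the step reduces to an ordinary LSTM update. A non-zero speed change only rescales the output channels and leaves the cell state untouched.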
