LG-Traj: LLM Guided Pedestrian Trajectory Prediction

Pranav Singh Chib, Pravendra Singh
2024-03-13
Abstract: Accurate pedestrian trajectory prediction is crucial for various applications, and it requires a deep understanding of pedestrian motion patterns in dynamic environments. However, existing pedestrian trajectory prediction methods still need more exploration to fully leverage these motion patterns. This paper investigates the possibilities of using Large Language Models (LLMs) to improve pedestrian trajectory prediction tasks by inducing motion cues. We introduce LG-Traj, a novel approach incorporating LLMs to generate motion cues present in pedestrian past/observed trajectories. Our approach also incorporates motion cues present in pedestrian future trajectories by clustering future trajectories of training data using a mixture of Gaussians. These motion cues, along with pedestrian coordinates, facilitate a better understanding of the underlying representation. Furthermore, we utilize singular value decomposition to augment the observed trajectories, incorporating them into the model learning process to further enhance representation learning. Our method employs a transformer-based architecture comprising a motion encoder to model motion patterns and a social decoder to capture social interactions among pedestrians. We demonstrate the effectiveness of our approach on popular pedestrian trajectory prediction benchmarks, namely ETH-UCY and SDD, and present various ablation experiments to validate our approach.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
This paper examines how large language models (LLMs) can improve the accuracy of pedestrian trajectory prediction. Existing methods still leave room for improvement in how they understand and exploit pedestrian motion patterns in dynamic environments. The paper proposes LG-Traj, a method that combines LLM-generated motion cues with pedestrians' past and future trajectory information to learn a better underlying representation. LG-Traj works through the following steps:
1. Augmenting the observed trajectories using singular value decomposition and integrating the augmented trajectories into the model learning process.
2. Clustering the future trajectories of the training data using a Gaussian mixture model to capture future motion cues.
3. Generating motion cues from past observed trajectories using an LLM and converting these cues into vectors using word embeddings.
4. Using a motion encoder to model motion patterns and a social decoder to capture social interactions between pedestrians for accurate future trajectory prediction.
Experimental results show that LG-Traj performs well on the popular pedestrian trajectory prediction benchmarks ETH-UCY and SDD, with improvements in both average displacement error (ADE) and final displacement error (FDE) compared to existing methods, supporting the effectiveness of the approach.
Keywords: large language models, trajectory prediction, pedestrian trajectory prediction, neural networks, deep learning.
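To make step 1 and the reported metrics more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that SVD-based augmentation means a rank-k reconstruction of the flattened batch of observed trajectories. This is one plausible reading, since the text above does not give the exact procedure. The function names `svd_augment` and `ade_fde`, the `(N, T, 2)` array layout, and the `rank` parameter are all hypothetical. ADE and FDE follow their standard definitions: the mean Euclidean displacement over all predicted time steps and the displacement at the final time step.

```python
import numpy as np


def svd_augment(traj: np.ndarray, rank: int) -> np.ndarray:
    """Return a rank-`rank` reconstruction of a batch of trajectories.

    Assumption: `traj` has shape (N, T, 2) with N pedestrians, T time steps,
    and (x, y) coordinates. The low-rank copy serves as the augmented view.
    """
    n, t, d = traj.shape
    flat = traj.reshape(n, t * d)                  # one row per pedestrian
    u, s, vt = np.linalg.svd(flat, full_matrices=False)
    k = min(rank, s.shape[0])                      # cannot exceed available rank
    low_rank = (u[:, :k] * s[:k]) @ vt[:k]         # keep the top-k singular triplets
    return low_rank.reshape(n, t, d)


def ade_fde(pred: np.ndarray, gt: np.ndarray) -> tuple[float, float]:
    """Average and final displacement error for arrays of shape (N, T, 2)."""
    dist = np.linalg.norm(pred - gt, axis=-1)      # (N, T) per-step displacement
    return float(dist.mean()), float(dist[:, -1].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    observed = rng.normal(size=(6, 8, 2))          # synthetic stand-in data
    augmented = svd_augment(observed, rank=2)
    ade, fde = ade_fde(augmented, observed)
    print(f"ADE={ade:.3f} FDE={fde:.3f}")
```

In this sketch, a smaller `rank` yields a smoother and more heavily perturbed copy of the observed trajectories, while a rank equal to the number of singular values reproduces the input exactly.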