Characterized Diffusion and Spatial-Temporal Interaction Network for Trajectory Prediction in Autonomous Driving

Haicheng Liao, Xuelin Li, Yongkang Li, Hanlin Kong, Chengyue Wang, Bonan Wang, Yanchen Guan, KaHou Tam, Zhenning Li, Chengzhong Xu
2024-05-03
Abstract: Trajectory prediction is a cornerstone in autonomous driving (AD), playing a critical role in enabling vehicles to navigate safely and efficiently in dynamic environments. To address this task, this paper presents a novel trajectory prediction model tailored for accuracy in the face of heterogeneous and uncertain traffic scenarios. At the heart of this model lies the Characterized Diffusion Module, an innovative module designed to simulate traffic scenarios with inherent uncertainty. This module enriches the predictive process by infusing it with detailed semantic information, thereby enhancing trajectory prediction accuracy. Complementing this, our Spatio-Temporal (ST) Interaction Module captures the nuanced effects of traffic scenarios on vehicle dynamics across both spatial and temporal dimensions with remarkable effectiveness. Demonstrated through exhaustive evaluations, our model sets a new standard in trajectory prediction, achieving state-of-the-art (SOTA) results on the Next Generation Simulation (NGSIM), Highway Drone (HighD), and Macao Connected Autonomous Driving (MoCAD) datasets across both short and extended temporal spans. This performance underscores the model's unparalleled adaptability and efficacy in navigating complex traffic scenarios, including highways, urban streets, and intersections.
Robotics
What problem does this paper attempt to address?
This paper addresses trajectory prediction for autonomous driving (AD), aiming to improve prediction accuracy in heterogeneous and uncertain traffic scenarios. It introduces a trajectory prediction model named CDSTraj. At its core is a "Characterized Diffusion Module" designed to simulate the uncertainty in traffic scenarios and to improve prediction accuracy by injecting detailed semantic information. A Spatio-Temporal Interaction Module complements it by capturing the influence of traffic scenarios on vehicle dynamics in both the spatial and temporal dimensions. The paper lists three main contributions:

1. **Characterized Diffusion Module**: By iteratively mitigating uncertainty and dynamically simulating future traffic scenarios, this module increases the accuracy of trajectory predictions. It integrates complex situational features into the prediction process, allowing for a more nuanced understanding of potential movements.
2. **Spatio-Temporal Interaction Module**: This module uses spatial-temporal attention mechanisms to model and analyze complex interactions within traffic scenarios. It has a three-stage architecture that captures and processes information across space and time.
3. **Experimental Results**: In extensive experiments, CDSTraj achieves state-of-the-art performance on the Next Generation Simulation (NGSIM), Highway Drone (HighD), and Macao Connected Autonomous Driving (MoCAD) datasets. The MoCAD dataset is of particular interest: its right-hand-drive configuration and mandatory left-hand driving rules provide a different evaluation setting, which highlights the model's adaptability and accuracy across driving scenarios.

The paper also details the model's framework and methodology, including the feature diffusion process, the implementation of spatio-temporal interaction, and the design of the decoder. In its experimental section, it reports that CDSTraj outperforms existing state-of-the-art methods over both short- and long-term prediction horizons. Finally, the paper proposes future research directions, including applying the model to pedestrian trajectory prediction and further exploring the integration of spatial and temporal information.
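To make the idea of iteratively removing uncertainty more concrete, the following is a minimal sketch of a generic DDPM/DDIM-style diffusion process applied to a feature tensor. It is not confirmed to be CDSTraj's formulation. The linear noise schedule, the function names (`make_schedule`, `add_noise`, `estimate_clean`, `denoise`), the tensor shape, and the "oracle" noise predictor that stands in for a trained network are all assumptions made for illustration.

```python
import numpy as np

def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumption); returns cumulative products of (1 - beta)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bars, rng):
    """Forward process: draw x_t from q(x_t | x_0) in closed form."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

def estimate_clean(xt, eps_hat, t, alpha_bars):
    """Invert the forward step, given an estimate of the noise."""
    return (xt - np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alpha_bars[t])

def denoise(xt, predict_noise, alpha_bars):
    """Deterministic (DDIM-style) reverse loop from the last step down to step 0."""
    x = xt
    for t in range(len(alpha_bars) - 1, 0, -1):
        eps_hat = predict_noise(x, t)
        x0_hat = estimate_clean(x, eps_hat, t, alpha_bars)
        # Re-noise the current clean estimate to the next lower noise level.
        x = np.sqrt(alpha_bars[t - 1]) * x0_hat + np.sqrt(1.0 - alpha_bars[t - 1]) * eps_hat
    return estimate_clean(x, predict_noise(x, 0), 0, alpha_bars)

rng = np.random.default_rng(0)
alpha_bars = make_schedule(num_steps=100)

# Hypothetical scene features: 8 observed time steps x 16 feature dimensions.
features = rng.standard_normal((8, 16))
noisy, true_eps = add_noise(features, len(alpha_bars) - 1, alpha_bars, rng)

# Oracle predictor that returns the true noise; a real model would learn this mapping.
recovered = denoise(noisy, lambda x, t: true_eps, alpha_bars)
print(np.allclose(recovered, features))
```

With the oracle predictor the loop recovers the original features up to floating-point error. In practice the predictor would be a learned network, possibly conditioned on scene context, and the recovery would only be approximate.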
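As a rough illustration of the spatial-temporal attention idea behind the interaction module, the sketch below applies plain scaled dot-product self-attention first along the time axis of each agent and then across agents at each time step. It does not reproduce the paper's three-stage architecture. The function names, the `(agents, time, features)` layout, and the absence of learned query/key/value projections are simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Scaled dot-product self-attention over the second-to-last axis.

    x has shape (..., n, d); queries, keys, and values are all x itself
    (no learned projections, which a real model would include).
    """
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)   # (..., n, n)
    weights = softmax(scores, axis=-1)
    return weights @ x, weights

def spatio_temporal_attention(x):
    """x: (agents, time, d). Attend over time per agent, then over agents per time step."""
    temporal, _ = self_attention(x)                    # (agents, time, d)
    per_step = np.swapaxes(temporal, 0, 1)             # (time, agents, d)
    spatial, agent_weights = self_attention(per_step)  # (time, agents, d)
    return np.swapaxes(spatial, 0, 1), agent_weights   # back to (agents, time, d)

rng = np.random.default_rng(0)
# Hypothetical scene: 3 vehicles, 5 time steps, 4 features per step.
scene = rng.standard_normal((3, 5, 4))
out, agent_weights = spatio_temporal_attention(scene)
print(out.shape, agent_weights.shape)
```

Each row of `agent_weights` sums to one and can be read as how strongly one vehicle attends to every vehicle in the scene, itself included, at a given time step.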