A trajectory data-driven approach for traffic risk prediction: incorporating variable interactions and pre-screening

Dan Wu, Jaeyoung Lee, Ye Li
DOI: https://doi.org/10.1080/12265934.2024.2346166
IF: 2.9
2024-05-07
International Journal of Urban Sciences
Abstract: Although historical crash data and trajectory data have been widely applied to crash and risk prediction, both types of data have their own limitations. As a solution, this study investigates the impact of traffic flow parameters and their interaction terms on risk prediction performance, employing a variable pre-screening approach, namely Smoothly Clipped Absolute Deviation (SCAD). A research framework is proposed for more efficient risk prediction, and a detailed case study is conducted using the proposed approach. In the case study, real vehicle trajectory data from HighD are processed and aggregated to extract both traffic flow parameters and the corresponding risk data for a specific time interval. For risk detection, the Time-to-Collision (TTC) index is used to identify risky conditions. For each lane type (i.e. inner, middle and outer lanes), the impact of variables, including interaction terms, on risk is explored using SCAD-logistic models. Machine learning methods are then employed to compare risk prediction performance before and after considering interaction terms, as well as before and after variable pre-screening, and the superiority of the machine learning models after SCAD-based variable pre-screening is demonstrated. Results indicate that the interaction terms between traffic flow parameters have significant impacts on traffic risk, and that considering interaction terms and variable pre-screening can improve risk prediction accuracy. Furthermore, the proposed models outperform Random Forest (RF) in predicting traffic risk, achieving an accuracy improvement of up to 21.24% and reducing computational time by up to 31.51%. The findings of this study are expected to contribute to high-precision prediction of real-time risk in the future.
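As a rough illustration of the TTC-based risk detection mentioned in the abstract, the sketch below computes TTC for one follower-leader pair as the gap divided by the closing speed. The function names, the argument layout and the 3.0 s threshold are assumptions made for this example; the abstract does not state the threshold or the exact formulation used in the study.

```python
import math


def time_to_collision(gap_m: float, v_follow_mps: float, v_lead_mps: float) -> float:
    """Return TTC in seconds for a following vehicle approaching its leader.

    gap_m is the bumper-to-bumper distance in metres. If the follower is
    not closing in on the leader, no collision is projected, so TTC is
    reported as infinity.
    """
    closing_speed = v_follow_mps - v_lead_mps
    if closing_speed <= 0:
        return math.inf
    return gap_m / closing_speed


def is_risky(ttc_s: float, threshold_s: float = 3.0) -> bool:
    """Flag a risky condition when TTC falls below a threshold.

    The 3.0 s default is a hypothetical value chosen for illustration only.
    """
    return ttc_s < threshold_s


# Example: 20 m gap, follower at 30 m/s, leader at 25 m/s.
ttc = time_to_collision(20.0, 30.0, 25.0)
print(ttc, is_risky(ttc))
```

Flags of this kind, aggregated over a time interval, would give a binary risk label that can be paired with the traffic flow parameters extracted for the same interval.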
environmental studies, urban studies