Spatial statistics and random forest approaches for traffic crash hot spot identification and prediction

Eskindir Ayele Atumo, Tuo Fang, Xinguo Jiang
DOI: https://doi.org/10.1080/17457300.2021.1983844
2021-10-06
International Journal of Injury Control and Safety Promotion
Abstract: This study evaluates crash hot spot identification and prediction on Michigan interstate routes using spatial statistics and random forest methods. Getis-Ord statistics identify hot spots from crash location, frequency, and equivalent property damage only (EPDO) weights computed from crash cost and severity. The random forest approach learns data patterns from 2010 to 2017 to predict crash hot spots in 2018. The results indicate that: (i) 13,089 crashes occurred on significant hot spots of interstate routes, 7,413 on cold spots, and the rest in other locations; (ii) the random forest achieves 76.7% and 74% accuracy for validation and prediction, respectively, and its performance is further supported by precision, recall, and F-score of 75%, 74%, and 70%, respectively; and (iii) crash clustering exhibits spatial dependence of high and low EPDO crashes. The practical significance of the approach lies in the identification and prediction of hot spots.
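
To make the hot spot step concrete, the following is a minimal sketch of a Getis-Ord Gi* z-score computed over equivalent property damage only (EPDO) scores. It is not the authors' implementation. The severity weights, segment coordinates, crash counts, the fixed-distance binary neighbourhood, and the function names `epdo_scores` and `getis_ord_gi_star` are all assumptions made for illustration; the paper derives its weights from crash cost and severity, and those values are not given in the abstract.

```python
import math


def epdo_scores(severity_counts, severity_weights):
    """Weighted crash score per segment: sum of count * weight over severity levels."""
    return [
        sum(count * weight for count, weight in zip(row, severity_weights))
        for row in severity_counts
    ]


def getis_ord_gi_star(coords, values, bandwidth):
    """Gi* z-score per site, using binary weights within `bandwidth`.

    Each site counts as its own neighbour (distance 0), which is what
    distinguishes Gi* from Gi. The result is undefined when all values are
    equal or when a neighbourhood covers every site (zero denominator).
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    z_scores = []
    for xi, yi in coords:
        w = [
            1.0 if math.hypot(xi - xj, yi - yj) <= bandwidth else 0.0
            for xj, yj in coords
        ]
        w_sum = sum(w)
        w_sq_sum = sum(wj * wj for wj in w)
        numerator = sum(wj * vj for wj, vj in zip(w, values)) - mean * w_sum
        denominator = std * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        z_scores.append(numerator / denominator)
    return z_scores


# Hypothetical data: six road segments along a line, with crash counts per
# segment given as [fatal, injury, property-damage-only].
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
counts = [[1, 4, 10], [2, 5, 12], [1, 3, 9], [0, 0, 2], [0, 1, 3], [0, 0, 1]]
weights = [10, 5, 1]  # placeholder severity weights, NOT the paper's values

epdo = epdo_scores(counts, weights)
z = getis_ord_gi_star(coords, epdo, bandwidth=1.5)
for segment, score in enumerate(z):
    print(segment, round(score, 2))
```

Under these assumptions, segments surrounded by high EPDO scores receive positive z-scores (hot spot candidates) and segments surrounded by low scores receive negative ones (cold spot candidates). A significance threshold would then be applied to the z-scores to decide which clusters count as significant.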
public, environmental & occupational health