Analyzing Factors and Interaction Terms Affecting Urban Fatal Crash Types Based on a Hybrid Framework of Econometric Model and Machine Learning Approaches

Zongpin Hu, Qin Shi, Yikai Chen, Quan Yuan, Zhengbin Tao, Yujie Bian, Md Mazharul Haque
DOI: https://doi.org/10.1080/13588265.2022.2130621
2021-01-01
SSRN Electronic Journal
Abstract: The discrete outcome model is an important method for analyzing the factors affecting crash outcomes. However, such models face two important problems: the lack of effective approaches for discretizing continuous variables and for mining interaction terms. To address these issues, this paper proposes a hybrid approach combining machine learning and econometric modelling to investigate fatal crash types in Shenzhen, China. First, fatal crash data from 2014 to 2016 were collected in Shenzhen. Second, the minimum description length principle (MDLP), a representative supervised discretization algorithm, was used to discretize the continuous variables in the data. This algorithm selects the cut-point that minimizes the class entropy within a given interval. Subsequently, the feature subset selection algorithm based on association rule mining (FEAST), which has advantages over other interaction-mining algorithms in terms of structure freedom and global search capability, was employed to mine the interaction effects between variables. Finally, the discretized continuous variables and the interaction terms were incorporated into the random parameters logit (RPL) model. Results reveal that the goodness of fit of the proposed MDLP-FEAST-RPL model is significantly better than that of the equal width discretization (EWD)-RPL, MDLP-RPL, and EWD-FEAST-RPL models. In addition, a total of eleven factors and interaction terms are associated with urban fatal crash types. These findings will facilitate the development of cost-effective policies or countermeasures for targeted crash types in large cities of developing countries.
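The entropy-based cut-point selection described for MDLP can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`entropy`, `best_cut`, `mdlp_accepts`) and the toy data are hypothetical, and the acceptance test assumes the standard Fayyad-Irani form of the MDLP stopping criterion, which the abstract does not spell out. Only a single split is shown; a full discretizer would apply the same step recursively to each resulting interval.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return (cut_point, weighted_entropy, left_labels, right_labels)
    for the boundary that minimizes the class entropy of the split,
    or None if all values are identical."""
    pairs = sorted(zip(values, labels))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    n = len(pairs)
    best = None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # a cut is only possible between distinct values
        left, right = ys[:i], ys[i:]
        weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        if best is None or weighted < best[1]:
            best = ((xs[i - 1] + xs[i]) / 2, weighted, left, right)
    return best


def mdlp_accepts(labels, left, right):
    """Assumed Fayyad-Irani MDLP criterion: accept the split only if the
    information gain exceeds the cost of encoding the cut."""
    n = len(labels)
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    ent, e1, e2 = entropy(labels), entropy(left), entropy(right)
    gain = ent - (len(left) / n) * e1 - (len(right) / n) * e2
    delta = math.log2(3 ** k - 2) - (k * ent - k1 * e1 - k2 * e2)
    return gain > (math.log2(n - 1) + delta) / n


# Hypothetical toy data: a continuous variable and a two-class outcome.
values = [20, 25, 30, 35, 60, 65, 70, 75]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

cut, weighted, left, right = best_cut(values, labels)
accepted = mdlp_accepts(labels, left, right)
print(cut, weighted, accepted)
```

In this toy case the two classes separate cleanly, so the chosen cut falls midway between 35 and 60 and the criterion accepts it. With noisier data the gain may fall below the threshold, in which case the interval is left unsplit.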