Predicting and Analysing Road Accident Severity with Machine Learning Models and Resampling

Xin Wang
DOI: https://doi.org/10.62051/p0xb2k56
2024-01-01
Abstract: Road accidents seriously threaten people's lives and property, so predicting and analysing them is of great significance. This research first compares the performance of four machine learning models in analysing the severity of traffic accidents: Random Forest, Naïve Bayes, Logistic Regression and Multi-Layer Perceptron. Naïve Bayes performs poorly, with low accuracy, while the other three models perform at a similar level. Multi-Layer Perceptron performs better on the minority classes than Random Forest and Logistic Regression. Among all features, Random Forest focuses on geographical and time features, whereas Logistic Regression and Multi-Layer Perceptron focus on driving features, including lighting and road surface. A resampling method is then applied to the imbalanced data. After being trained on the resampled data, Random Forest and Logistic Regression perform better on the minority classes, with higher precision and recall. In summary, this research compares the performance of machine learning models in predicting and analysing road accident severity, and addresses the challenge of imbalanced data with a resampling method.
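The resampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which resampling method was used, so this example assumes plain random oversampling, in which minority-class samples are duplicated until every class is as large as the largest one. The function names, the severity labels ("slight", "serious", "fatal") and the toy data are hypothetical and are not taken from the paper. The second helper computes the per-class precision and recall that the abstract uses to judge minority-class performance.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class samples at random until all classes
    match the size of the largest class (assumed method, not the paper's)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)

    # Shuffle so that classes are not grouped together.
    order = list(range(len(out_rows)))
    rng.shuffle(order)
    return [out_rows[i] for i in order], [out_labels[i] for i in order]


def per_class_precision_recall(y_true, y_pred):
    """Return {class: (precision, recall)} for every class in y_true."""
    scores = {}
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        predicted = sum(p == cls for p in y_pred)
        actual = sum(t == cls for t in y_true)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        scores[cls] = (precision, recall)
    return scores


# Hypothetical, made-up imbalanced severity labels (not the paper's data).
labels = ["slight"] * 8 + ["serious"] * 3 + ["fatal"] * 1
rows = [[i] for i in range(len(labels))]

bal_rows, bal_labels = random_oversample(rows, labels, seed=42)
print(Counter(labels))      # class sizes before resampling
print(Counter(bal_labels))  # class sizes after resampling
```

In a setup like this, resampling would normally be applied to the training split only, so that the test set keeps the original class distribution and the reported precision and recall remain meaningful.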