Machine learning based car accident risk prediction for usage-based insurance
Silvia Strada, Emanuele Costantini, Simone Formentin, Sergio M. Savaresi
DOI: https://doi.org/10.3233/ida-230971
IF: 1.7
2024-07-20
Intelligent Data Analysis
Abstract: The Usage-Based Insurance paradigm, which has received considerable attention in recent years, envisages computing the car policy premium from the accident risk probability, estimated by observing past driving history and habits. However, Usage-Based Insurance strategies are usually based on simple empirical decision rules built on travelled distance. Developing intelligent systems that predict risk from the overall stored driving behaviour, without the need for other insurance or socio-demographic information, is still an open challenge. This work explores a comprehensive machine learning approach based solely on driving-related data from private vehicles. The anonymized dataset employed in this study is provided by the telematics company UnipolTech and contains data, densely sampled in space and time, on the trips of almost 100,000 vehicles uniformly spread over the Italian territory, recorded every 2 km by fixed on-board telematics devices (black boxes) from February 2018 to February 2020. An innovative feature engineering process is proposed to uncover novel informative quantities that capture complex aspects of driving behaviour. Recent and powerful learning techniques are explored to develop advanced predictive models that provide a reliable accident probability for each vehicle while automatically managing the critical class imbalance intrinsic to this kind of dataset.
computer science, artificial intelligence
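The abstract does not say how the class imbalance is handled. As a minimal sketch of one common option, class reweighting, the Python below computes "balanced" class weights and a class-weighted log loss. The function names, the toy labels, and the 95/5 split are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_log_loss(labels, probs, weights, eps=1e-12):
    """Class-weighted binary cross-entropy, normalised by the total weight."""
    total, weight_sum = 0.0, 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        w = weights[y]
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum


# Hypothetical toy labels: 1 = vehicle had an accident, 0 = no accident.
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)

# Predicted accident probabilities from some model (here a constant guess).
probs = [0.5] * len(labels)
loss = weighted_log_loss(labels, probs, weights)
```

With these weights the rare accident class and the frequent no-accident class contribute the same total weight to the loss, so a model cannot score well simply by predicting "no accident" for every vehicle.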