A Comparative Study on the Effectiveness of Machine Learning Methods in Auto Insurance Fraud Identification

CHEN Kai, Li Bin-jie
DOI: https://doi.org/10.13497/j.cnki.is.2022.12.006
2022-01-01
Abstract: The size of China's auto insurance market has given rise to a large volume of auto insurance fraud. However, traditional auto insurance fraud identification methods are not effective. This paper conducts an empirical analysis on four data sets to compare the prediction performance and robustness of six mainstream machine learning methods for auto insurance fraud detection. We split each of the four original data sets into a training set and a test set. The training set is used to build the machine learning models, and the test set is used to evaluate them. On this basis, we evaluate both the prediction performance of each machine learning method and the robustness of that performance. First, we use the SMOTE sampling method to generate synthetic samples in order to balance the numbers of fraud and non-fraud samples in the training set. We then use 10-fold cross-validation to select the best combination of tuning parameters for each method. We use the receiver operating characteristic (ROC) curve and the area under the curve (AUC) as the evaluation criteria for prediction performance. Finally, we find that the random forest model and the extreme gradient boosting (XGBoost) decision tree model achieve better prediction performance and robustness.
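Two steps named in the abstract, SMOTE oversampling of the training set and AUC-based evaluation, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `smote` and `roc_auc`, the neighbour count `k`, the random seed, and the toy data are all assumptions made for illustration, and a real study would more likely rely on an established library.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (hypothetical helper).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample is scored
    above a random negative sample (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy training set (made-up numbers): 3 fraud vs. 10 non-fraud samples.
fraud = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2]]
non_fraud = [[float(i), float(i % 3)] for i in range(10)]

# Oversample only the training set until both classes are the same size.
new_fraud = smote(fraud, n_new=len(non_fraud) - len(fraud), k=2)
balanced_fraud = fraud + new_fraud
```

Oversampling is applied to the training set only, as the abstract describes, so the test set keeps its original class ratio. The pairwise AUC above is quadratic in the sample count and is meant for clarity, not for large data sets.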