Accuracy Comparison between Five Machine Learning Algorithms for Financial Risk Evaluation

Haokun Dong, Rui Liu, Allan W. Tham
DOI: https://doi.org/10.3390/jrfm17020050
2024-01-29
Journal of Risk and Financial Management
Abstract: An accurate prediction of loan default is crucial in credit risk evaluation. Even a slight loss of predictive accuracy can cause financial losses to lending institutions. This study describes a non-parametric approach that compares five machine learning classifiers, with a focus on sufficiently large datasets. It reports standard performance measures (accuracy, precision, recall and F1 score) in addition to the area under the receiver operating characteristic curve (ROC-AUC). Various data pre-processing techniques are discussed and implemented, including normalization and standardization, imputation of missing values, and the handling of imbalanced data using SMOTE. The study also examines the use of hyper-parameters in the classifiers. During the model construction phase, various pipelines feed data to the five classifiers, and the performance results are reported with SMOTE or hyper-parameters versus without SMOTE and hyper-parameters. Each classifier is compared with the others in terms of accuracy during the training and prediction phases, based on out-of-sample data. The two datasets used for this experiment contain 1,000 and 30,000 observations, respectively, each with a training/testing ratio of 80:20. The comparative results show that random forest outperforms the other four classifiers in both training and actual prediction.
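To make the performance measures named in the abstract concrete, the following is a minimal sketch of how accuracy, precision, recall and F1 score can be computed from binary predictions. It is not the authors' code: the function name `classification_metrics`, the convention that label 1 means "default", and the toy labels are assumptions made purely for illustration.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption for this sketch: label 1 is the positive class (loan default)
    and label 0 is the negative class (no default).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one observation is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Return 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical imbalanced sample: 8 non-defaults and 2 defaults.
    y_true = [0] * 8 + [1] * 2
    # A classifier that always predicts "no default".
    y_pred = [0] * 10
    print(classification_metrics(y_true, y_pred))
```

On this toy sample the always-negative classifier is correct on 8 of 10 observations yet identifies none of the defaults, so its recall is zero. This is one reason precision, recall and F1 are reported alongside accuracy when classes are imbalanced.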