Comparing Data Mining Models in Loan Default Prediction: A Framework and a Demonstration

Cuong Nguyen, Liang Chen
DOI: https://doi.org/10.25126/jitecs.202271352
2022-04-07
Journal of Information Technology and Computer Science
Abstract: In the banking sector, credit risk assessment is an important process for ensuring that loans are repaid on time and that banks maintain their credit performance. Despite the considerable business effort devoted to credit scoring every year, a high percentage of loan defaults remains a major issue. With the availability of large volumes of banking data and advanced analytics tools, data mining algorithms can be applied to develop a credit scoring platform and to address the loan default problem. This paper puts forward a framework for comparing four classification algorithms (logistic regression, decision tree, neural network, and XGBoost) using a public dataset. Confusion matrix and Monte Carlo simulation benchmarks are used to evaluate their performance. We find that XGBoost outperforms the other three traditional models. We also offer practical recommendations and directions for future research.
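The abstract names confusion matrix and Monte Carlo benchmarks without giving details, so the following is only a minimal sketch of how such a comparison could be set up. It assumes that "Monte Carlo simulation" means averaging test accuracy over repeated random train/test splits, which the abstract does not confirm. All function names are hypothetical, the data are synthetic, and the two toy models (a majority-class baseline and a one-feature threshold rule) are stand-ins for illustration, not the four algorithms compared in the paper.

```python
import random

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 means default."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def accuracy(y_true, y_pred):
    """Share of correct predictions, derived from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)

def monte_carlo_accuracy(fit, rows, labels, n_runs=100, test_frac=0.3, seed=0):
    """Average test accuracy over repeated random train/test splits.

    Assumption: this is one common reading of a Monte Carlo benchmark;
    the paper's actual procedure may differ.
    """
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    n_test = max(1, int(len(rows) * test_frac))
    scores = []
    for _ in range(n_runs):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        predict = fit([rows[i] for i in train], [labels[i] for i in train])
        y_pred = [predict(rows[i]) for i in test]
        scores.append(accuracy([labels[i] for i in test], y_pred))
    return sum(scores) / len(scores)

def fit_majority(rows, labels):
    """Hypothetical baseline: always predict the majority training class."""
    majority = 1 if 2 * sum(labels) >= len(labels) else 0
    return lambda row: majority

def fit_threshold(rows, labels):
    """Hypothetical stand-in model: cut the first feature at the midpoint
    between the two class means."""
    ones = [r[0] for r, y in zip(rows, labels) if y == 1]
    zeros = [r[0] for r, y in zip(rows, labels) if y == 0]
    if not ones or not zeros:
        # Only one class seen in training: fall back to the baseline.
        return fit_majority(rows, labels)
    mean1, mean0 = sum(ones) / len(ones), sum(zeros) / len(zeros)
    cut = (mean1 + mean0) / 2
    if mean1 >= mean0:
        return lambda row: 1 if row[0] > cut else 0
    return lambda row: 1 if row[0] < cut else 0

# Synthetic data: one feature, with higher values labeled as defaults.
rows = [[x] for x in range(20)]
labels = [0] * 10 + [1] * 10

maj_score = monte_carlo_accuracy(fit_majority, rows, labels)
thr_score = monte_carlo_accuracy(fit_threshold, rows, labels)
print("majority baseline:", maj_score)
print("threshold rule:", thr_score)
```

In a real comparison, each `fit` function would wrap one of the candidate classifiers, and metrics beyond accuracy (for example precision and recall from the same confusion counts) would likely matter more, since loan default data are typically imbalanced.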