Loan Default Prediction Based on Logistic Regression and XGBoost Modeling

Ouyang Yi
DOI: https://doi.org/10.1109/ICCECT60629.2024.10546207
2024-04-26
Abstract: Accurate prediction of loan default risk is critically important in the financial sector. In this paper, we apply logistic regression and XGBoost to predict loan defaults. Our methodology involves data preprocessing, including handling missing values and encoding categorical variables, followed by training logistic regression and XGBoost models on the preprocessed dataset. Our experiments show that the XGBoost model outperforms logistic regression in predictive accuracy. Through feature importance analysis, we also identify the key factors that contribute to loan default prediction. These findings can guide financial institutions in assessing and mitigating loan default risk, supporting a more secure and stable lending environment.
Computer Science, Business
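
The abstract outlines a pipeline of missing-value handling, categorical encoding, and model training. The following is a minimal sketch of the logistic-regression half of such a pipeline, not the paper's implementation. The toy records, the field names (`income`, `purpose`, `default`), the choice of median imputation and one-hot encoding, and all hyperparameters are assumptions made for illustration. XGBoost is left out so the sketch runs on the Python standard library alone.

```python
import math
from statistics import median

# Hypothetical toy records; field names and values are illustrative only.
RECORDS = [
    {"income": 20.0, "purpose": "business", "default": 1},
    {"income": 25.0, "purpose": "car", "default": 1},
    {"income": None, "purpose": "business", "default": 1},
    {"income": 30.0, "purpose": "business", "default": 1},
    {"income": 80.0, "purpose": "car", "default": 0},
    {"income": 90.0, "purpose": "home", "default": 0},
    {"income": None, "purpose": "home", "default": 0},
    {"income": 75.0, "purpose": "car", "default": 0},
]


def impute_median(rows, key):
    """Replace missing (None) values of `key` with the median of observed values."""
    fill = median(r[key] for r in rows if r[key] is not None)
    return [r[key] if r[key] is not None else fill for r in rows]


def one_hot(values):
    """Encode a categorical column as one 0/1 indicator per category."""
    categories = sorted(set(values))
    return categories, [[1.0 if v == c else 0.0 for c in categories] for v in values]


def standardize(column):
    """Rescale a numeric column to zero mean and unit (population) variance."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column)) or 1.0
    return [(x - mean) / std for x in column]


def build_features(rows):
    """Return the feature matrix and the matching feature names."""
    income = standardize(impute_median(rows, "income"))
    categories, encoded = one_hot([r["purpose"] for r in rows])
    names = ["income"] + [f"purpose={c}" for c in categories]
    return [[inc] + enc for inc, enc in zip(income, encoded)], names


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(X, w, b):
    """Predicted default probability for each row of X."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errors = [p - yi for p, yi in zip(predict_proba(X, w, b), y)]
        grad_w = [sum(e * xi[j] for e, xi in zip(errors, X)) / n for j in range(d)]
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * sum(errors) / n
    return w, b


if __name__ == "__main__":
    X, names = build_features(RECORDS)
    y = [r["default"] for r in RECORDS]
    w, b = fit_logistic(X, y)
    # Coefficient magnitude on standardized inputs is only a rough importance proxy.
    for name, weight in sorted(zip(names, w), key=lambda t: -abs(t[1])):
        print(f"{name}: {weight:+.3f}")
```

On real data the same steps would normally be evaluated on a held-out test set rather than on the training records. With the third-party `xgboost` package installed, `xgboost.XGBClassifier` could be fit on the same feature matrix and its `feature_importances_` attribute inspected for comparison.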