Predicting credit risk on the basis of financial and non-financial variables and data mining

Sihem Khemakhem, Younes Boujelbene
DOI: https://doi.org/10.1108/raf-07-2017-0143
2018-08-13
Review of Accounting and Finance
Abstract:
Purpose: Data mining for predicting credit risk is a beneficial tool for financial institutions to evaluate the financial health of companies. However, the ubiquity of selecting parameters and the presence of unbalanced data sets is a very typical problem of this technique. This study aims to provide a new method for evaluating credit risk, taking into account not only financial and non-financial variables, but also the class imbalance.
Design/methodology/approach: The most significant financial and non-financial variables were determined to build a credit scoring model and identify the creditworthiness of companies. Moreover, the Synthetic Minority Oversampling Technique was used to solve the problem of class imbalance and improve the performance of the classifier. The artificial neural networks and decision trees were designed to predict default risk.
Findings: Results showed that profitability ratios, repayment capacity, solvency, duration of a credit report, guarantees, size of the company, loan number, ownership structure and the corporate banking relationship duration turned out to be the key factors in predicting default. Also, both algorithms were found to be highly sensitive to class imbalance. However, with balanced data, the decision trees displayed higher predictive accuracy for the assessment of credit risk than artificial neural networks.
Originality/value: Classification results depend on the appropriateness of data characteristics and the appropriate analysis algorithm for data sets. The selection of financial and non-financial variables, as well as the resolution of class imbalance allows companies to assess their credit risk successfully.
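
The abstract names the Synthetic Minority Oversampling Technique as the remedy for class imbalance but does not describe its mechanics. As a rough illustration only, and not the authors' implementation, the general idea can be sketched in plain Python: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours. The function name `smote`, the two-ratio toy data and all parameter values below are assumptions made for this sketch.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority-class neighbours.

    Simplified sketch of the SMOTE idea; names and defaults are illustrative.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (squared Euclidean distance),
        # excluding the base point itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Hypothetical toy data: two financial ratios per defaulted company (minority class)
defaulted = [[0.10, 1.8], [0.12, 2.1], [0.08, 1.9], [0.15, 2.4]]
healthy_count = 10  # assumed size of the majority class

new_points = smote(defaulted, n_new=healthy_count - len(defaulted), k=2)
balanced_minority = defaulted + new_points
```

After this step the minority class has as many rows as the assumed majority class, and a classifier such as a decision tree or a neural network could be trained on the balanced set. In practice, oversampling is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.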