Credit Scoring Using Ensemble Classification Based on Variable Weighting Clustering

Haiyang Ding, Peng Zhang, Tun Lu, Hansu Gu, Ning Gu
DOI: https://doi.org/10.1109/cscwd.2017.8066746
2017-01-01
Abstract: Credit scoring plays an important role in financial institutions, debt-based crowdfunding platforms, and peer-to-peer lending platforms. In the last few years, ensemble methods for credit scoring have become much more popular. However, the performance of ensemble methods is easily affected by the parameter settings and the number of base classifiers. Ensemble classification based on clustering can determine the best number of base classifiers automatically through clustering, and can find optimal parameter settings for the base classifiers by training each one individually on a training subset formed by combining clusters. In this way, the adverse effects of manually setting the parameters and the number of base classifiers can be avoided. However, conventional clustering methods do not consider the different contributions of attributes to the distance metric, which may decrease the performance of ensemble classifiers built on them. Moreover, unbalanced training subsets degrade the base classifiers, which in turn degrades the ensemble classifier. To address these problems, our approach first assigns different weights to different variables when measuring the distance between two instances in the clustering step, and then adopts the Subagging resampling method to deal with unbalanced training subsets during training. Experimental results show that our approach can improve the performance of the ensemble classifier.
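The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the variable weights are learned or how subsamples are drawn, so the weighted Euclidean form, the equal-per-class sampling scheme, and all function names below (`weighted_distance`, `balanced_subsamples`, `majority_vote`) are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def weighted_distance(x, y, weights):
    """Euclidean distance in which feature j is scaled by weights[j].

    Assumption: weights are given; how they are learned is not shown here.
    """
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def balanced_subsamples(labels, n_rounds, seed=0):
    """Yield index lists drawn WITHOUT replacement (subagging-style).

    Assumption: each subsample takes the same number of instances from
    every class, equal to the size of the smallest class.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    size = min(len(idx) for idx in by_class.values())
    for _ in range(n_rounds):
        sample = []
        for idx in by_class.values():
            sample.extend(rng.sample(idx, size))
        yield sample


def majority_vote(predictions):
    """Aggregate base-classifier predictions for one instance by plurality."""
    return Counter(predictions).most_common(1)[0][0]
```

In use, a base classifier would be trained on each index list from `balanced_subsamples`, and the per-instance predictions combined with `majority_vote`. Setting a weight to zero in `weighted_distance` removes that variable from the clustering distance entirely, which shows how weighting changes which instances end up close together.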