Variable Selection in Credit Risk Models for Chinese Listed Companies

胡心瀚,叶五一,缪柏其
DOI: https://doi.org/10.13860/j.cnki.sltj.2012.06.017
2012-01-01
Abstract: The high dimensionality and strong correlation typical of credit risk data sets are considered to seriously impair model accuracy. Taking into account the requirements of credit risk modeling and existing variable selection algorithms, this paper designs a new nonparametric variable selection method that excludes noise variables and collinear variables from the original data set. It also proposes a "forward and backward" algorithm to find the optimal solution for the new variable selection method. The logistic regression model is used as the example in the empirical analysis. The results show that, compared with other variable selection methods, the proposed method not only reduces the data dimension and removes collinear variables but also makes the model more precise and reliable.
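To make the idea of a "forward and backward" search concrete, the following is a minimal sketch of a generic stepwise subset search, not the paper's method. The abstract does not state the nonparametric criterion, so the score used here (absolute Pearson correlation with the target, minus a penalty for pairwise correlation among selected variables) is an assumption chosen only for illustration. The function names, the `penalty` parameter, and the toy variables `leverage`, `leverage_x2`, and `noise` are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b) if var_a > 0 and var_b > 0 else 0.0


def score(subset, columns, target, penalty=1.0):
    """Illustrative criterion (an assumption, not the paper's): relevance minus redundancy."""
    relevance = sum(abs(pearson(columns[j], target)) for j in subset)
    redundancy = sum(
        abs(pearson(columns[i], columns[j]))
        for i, j in combinations(sorted(subset), 2)
    )
    return relevance - penalty * redundancy


def forward_backward_select(columns, target, penalty=1.0, tol=1e-9):
    """Alternate forward (add) and backward (drop) steps until the score stops improving."""
    selected, best = set(), 0.0
    improved = True
    while improved:
        improved = False
        # Forward step: add the single variable that raises the score the most.
        trials = [
            (score(selected | {j}, columns, target, penalty), j)
            for j in columns if j not in selected
        ]
        if trials:
            s, j = max(trials)
            if s > best + tol:
                selected.add(j)
                best, improved = s, True
        # Backward step: drop any variable whose removal raises the score.
        for j in sorted(selected):
            s = score(selected - {j}, columns, target, penalty)
            if s > best + tol:
                selected.remove(j)
                best, improved = s, True
    return sorted(selected)


# Toy data: one informative variable, an exactly collinear copy, and a noise variable.
target = [0, 0, 0, 0, 1, 1, 1, 1]  # hypothetical default indicator
leverage = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
columns = {
    "leverage": leverage,
    "leverage_x2": [2 * v for v in leverage],  # collinear with leverage
    "noise": [1, -1, 1, -1, 1, -1, 1, -1],     # uncorrelated with target
}
selected = forward_backward_select(columns, target)
print(selected)
```

On this toy data the search keeps only one of the two collinear variables and leaves out the noise variable. The retained subset could then be passed to a logistic regression, as in the paper's empirical analysis.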