Principal Component Analysis and Factor Analysis for Feature Selection in Credit Rating

Shenghuan Yang, Ionut Florescu, Md Tariqul Islam
DOI: https://doi.org/10.48550/arXiv.2011.09137
2020-11-18
Statistical Finance
Abstract: A credit rating is an evaluation of a company's credit risk that assesses its ability to pay back debt and predicts the likelihood of the debtor defaulting. Many features influence a credit rating, so it is essential to select the substantive ones in order to explore the main reasons for credit rating changes. To address this issue, this paper used Principal Component Analysis and Factor Analysis as feature selection algorithms to select important features, group similar features together, and obtain a minimum set of features for four sectors: Financial, Energy, Health Care, and Consumer Discretionary. The paper used two data sets, Financial Ratio and Balance Sheet, with two mappings, Detailed Mapping and Coarse Mapping, that convert the target variable (credit rating) into a categorical variable. To test the accuracy of credit rating prediction, a Random Forest Classifier was trained and tested on the feature sets. The results showed that the accuracy of the Financial Ratio feature sets was higher than that of the Balance Sheet feature sets. In addition, Factor Analysis can reduce the number of features significantly while retaining almost the same accuracy, which can dramatically decrease the time spent analyzing data. Using Factor Analysis, we also summarized seven dominant factors affecting credit rating change in the Financial Ratio data and ten in the Balance Sheet data, which better explain the reasons for credit rating changes.
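The pipeline described in the abstract (reduce the feature set, then compare Random Forest accuracy) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses a synthetic data set in place of the Financial Ratio table, and feeds the PCA components or Factor Analysis factors directly to the classifier instead of reproducing the paper's feature selection procedure. The choice of seven components mirrors the seven dominant factors reported for Financial Ratio; the sample size, feature count, and the helper name fit_and_score are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, FactorAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a table of financial ratios (NOT the paper's data).
# Three classes play the role of a coarse rating mapping (assumption).
X, y = make_classification(
    n_samples=600, n_features=30, n_informative=7, n_redundant=10,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def fit_and_score(reducer):
    """Standardize, optionally reduce dimensionality, then fit a Random Forest.

    Returns the classification accuracy on the held-out test split.
    """
    steps = [StandardScaler()]
    if reducer is not None:
        steps.append(reducer)
    steps.append(RandomForestClassifier(n_estimators=200, random_state=0))
    model = make_pipeline(*steps)
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Compare the full feature set against two 7-dimensional reductions.
scores = {
    "all features": fit_and_score(None),
    "PCA (7 components)": fit_and_score(PCA(n_components=7, random_state=0)),
    "Factor Analysis (7 factors)": fit_and_score(
        FactorAnalysis(n_components=7, random_state=0)
    ),
}

for name, accuracy in scores.items():
    print(f"{name}: accuracy = {accuracy:.3f}")
```

On real data, the accuracies would be compared per sector and per rating mapping; the numbers printed here say nothing about the paper's results.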