A Data Slicing Method to Improve Machine Learning Model Accuracy in Bankruptcy Prediction

Ziyuan Ye
DOI: https://doi.org/10.1145/3480001.3480008
2021-07-23
Abstract: High-accuracy bankruptcy prediction has been important to investors and corporate finance officers for decades. Using bankruptcy data from China and Poland, this exploratory study aims to aid feature engineering for bankruptcy prediction through a new method we call “Data Slicing.” Our data slicing analysis makes predictions on carefully selected, sliced financial datasets and measures the prediction accuracy of each slice. We find that the most relevant metric, and the best variable to slice on to obtain a predictable sliced dataset, is the “Solvency Ratio” in both the Chinese and the Polish data. In addition, using two different sliced datasets improves the accuracy of machine learning and deep learning methods. We suggest Support Vector Machine, Neural Network, and Random Forest methods for higher accuracy in bankruptcy detection. In summary, investors and risk management officers are strongly advised to pay attention to a firm's ability to pay its debts, especially in their valuations and forecasts.
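The slicing idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the data are synthetic, the column names (`solvency`, `margin`, `bankrupt`), the bin edges, and the 70/30 split are hypothetical, and a one-feature threshold classifier stands in for the Support Vector Machine, Neural Network, and Random Forest models the abstract mentions. The sketch only shows the loop of slicing on one variable, fitting a model per slice, and comparing held-out accuracy across slices.

```python
import random


def make_toy_firms(n=600, seed=0):
    """Generate synthetic firms (an assumption, not the paper's data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        solvency = rng.uniform(0.0, 1.0)
        margin = rng.uniform(-0.5, 0.5)
        # Toy assumption: labels get noisier as solvency increases,
        # so slices differ in how predictable they are.
        flip = rng.random() < 0.4 * solvency
        label = int((margin < 0.0) != flip)
        rows.append({"solvency": solvency, "margin": margin, "bankrupt": label})
    return rows


def slice_rows(rows, column, edges):
    """Split rows into consecutive bins [edges[i], edges[i+1]) on `column`."""
    return [
        [r for r in rows if lo <= r[column] < hi]
        for lo, hi in zip(edges[:-1], edges[1:])
    ]


def accuracy(rows, feature, threshold):
    """Share of rows where 'bankrupt if feature < threshold' matches the label."""
    hits = sum(int(r[feature] < threshold) == r["bankrupt"] for r in rows)
    return hits / len(rows)


def fit_stump(train, feature):
    """Stand-in classifier: pick the threshold with the best training accuracy."""
    candidates = sorted(r[feature] for r in train)
    return max(candidates, key=lambda t: accuracy(train, feature, t))


def evaluate_slices(rows, column, edges, feature, seed=0):
    """Fit and score one model per slice; return held-out accuracy per slice."""
    rng = random.Random(seed)
    scores = []
    for part in slice_rows(rows, column, edges):
        part = part[:]            # copy so the caller's list is not shuffled
        rng.shuffle(part)
        cut = int(0.7 * len(part))  # hypothetical 70/30 train/test split
        train, test = part[:cut], part[cut:]
        threshold = fit_stump(train, feature)
        scores.append(accuracy(test, feature, threshold))
    return scores


firms = make_toy_firms()
scores = evaluate_slices(firms, "solvency", [0.0, 0.5, 1.0], "margin")
for name, score in zip(["low solvency", "high solvency"], scores):
    print(f"{name}: held-out accuracy {score:.2f}")
```

Repeating `evaluate_slices` with a different `column` argument would be one way to compare candidate slicing variables, which is the kind of comparison the abstract reports favoring the solvency ratio.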