Machine learning loss given default for corporate debt

Luke M. Olson, Min Qi, Xiaofei Zhang, Xinlei Zhao
DOI: https://doi.org/10.1016/j.jempfin.2021.08.009
IF: 3.025
2021-12-01
Journal of Empirical Finance
Abstract: We apply multiple machine learning (ML) methods to model loss given default (LGD) for corporate debt using a common dataset that is cross-sectional but collected over different time periods and shows much variation over time. We investigate the efficacy of three cross-validation (CV) schemes for hyper-parameter tuning, and of bootstrap aggregation (bagging), in preventing out-of-time model performance deterioration. The three CV methods are shuffled K-fold, unshuffled K-fold, and sequential blocked, which respectively destroy completely, partly keep, and fully retain the chronological order in the data. We find that it is important to keep the chronological order in the data when creating the training and testing samples, and that the more of the chronological order is retained, the more stable the out-of-time ML LGD model performance. By contrast, although bagging improves out-of-time fit in some cases, its effect is marginal relative to that of the unshuffled K-fold and sequential blocked CV methods. Substantial uncertainty in relative out-of-time performance remains, however, so ongoing model performance monitoring and benchmarking are still essential for sound model risk management for corporate LGD and other ML models.
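To make the difference between the three CV schemes concrete, the following is a minimal Python sketch of how each might split row indices that are already sorted by date. It is an illustration, not the authors' implementation: the function names `kfold_splits` and `sequential_blocked_splits` are hypothetical, and the sequential blocked scheme is assumed here to train on all earlier blocks and validate on the next one, which is one plausible reading of the abstract rather than a detail it states.

```python
import random


def _fold_bounds(n, k):
    """Return k+1 cut points that divide range(n) into k near-equal blocks."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    bounds = [0]
    for size in sizes:
        bounds.append(bounds[-1] + size)
    return bounds


def kfold_splits(n, k, shuffle=False, seed=0):
    """Yield (train, val) index lists for K-fold CV over n time-ordered rows.

    shuffle=True  -> shuffled K-fold: chronological order is destroyed.
    shuffle=False -> unshuffled K-fold: each validation fold is a contiguous
                     time block, but training rows may come after it in time.
    """
    idx = list(range(n))
    if shuffle:
        random.Random(seed).shuffle(idx)
    bounds = _fold_bounds(n, k)
    for i in range(k):
        lo, hi = bounds[i], bounds[i + 1]
        yield idx[:lo] + idx[hi:], idx[lo:hi]


def sequential_blocked_splits(n, k):
    """Yield (train, val) index lists where validation always follows training.

    Assumption: an expanding training window over earlier blocks, validated
    on the next block, so chronological order is fully retained.
    """
    bounds = _fold_bounds(n, k)
    for i in range(1, k):
        yield list(range(bounds[i])), list(range(bounds[i], bounds[i + 1]))


if __name__ == "__main__":
    # Six rows sorted by date, three blocks.
    for name, splits in [
        ("shuffled K-fold", kfold_splits(6, 3, shuffle=True)),
        ("unshuffled K-fold", kfold_splits(6, 3)),
        ("sequential blocked", sequential_blocked_splits(6, 3)),
    ]:
        for train, val in splits:
            print(name, "train:", train, "val:", val)
```

Under this sketch only the sequential blocked splits guarantee that every training row precedes every validation row, which is the property the abstract associates with more stable out-of-time performance. Note that it produces one fewer split than K-fold, because the earliest block is never used for validation.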
economics, business, finance