Bond Default Prediction with Text Embeddings, Undersampling and Deep Learning

Luke Jordan
DOI: https://doi.org/10.48550/arXiv.2110.07035
2021-10-14
Abstract: The special and important problems of default prediction for municipal bonds are addressed using a combination of text embeddings from a pre-trained transformer network, a fully connected neural network, and synthetic oversampling. The combination of these techniques provides significant improvement in performance over human estimates, linear models, and boosted ensemble models, on data with extreme imbalance. Less than 0.2% of municipal bonds default, but our technique predicts 9 out of 10 defaults at the time of issue, without using bond ratings, at a cost of false positives on less than 0.1% of non-defaulting bonds. The results hold the promise of reducing the cost of capital for local public goods, which are vital for society, and bring techniques previously used in personal credit and public equities (or national fixed income), as well as the current generation of embedding techniques, to sub-sovereign credit decisions.
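To make the rates quoted in the abstract concrete, the following is a minimal arithmetic sketch, not code from the paper. It treats the abstract's bounds as point values (a 0.2% default rate, 90% of defaults caught, a 0.1% false positive rate among non-defaulting bonds) and applies them to a hypothetical pool of 100,000 bonds. The function name `implied_counts`, its parameters, and the pool size are illustrative assumptions.

```python
def implied_counts(n_bonds, default_rate, recall, false_positive_rate):
    """Convert headline rates into expected counts for a pool of bonds.

    All inputs are illustrative: the abstract gives upper bounds
    ("less than 0.2%", "less than 0.1%"), which are used here as point values.
    """
    defaults = n_bonds * default_rate
    non_defaults = n_bonds - defaults
    true_positives = defaults * recall                    # defaults flagged at issue
    false_positives = non_defaults * false_positive_rate  # healthy bonds flagged
    flagged = true_positives + false_positives
    return {
        "defaults": defaults,
        "true_positives": true_positives,
        "false_positives": false_positives,
        "precision": true_positives / flagged,  # share of flagged bonds that default
    }


# Hypothetical pool of 100,000 bonds with the abstract's rates as point values.
result = implied_counts(
    n_bonds=100_000,
    default_rate=0.002,
    recall=0.9,
    false_positive_rate=0.001,
)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, about 180 of the roughly 200 defaulting bonds would be flagged alongside about 100 non-defaulting ones, so roughly 64% of flagged bonds would actually default. Because the abstract states the rates only as bounds, these figures are indicative rather than results reported by the paper.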
Machine Learning
What problem does this paper attempt to address?