Multi-view GCN for loan default risk prediction

Zihao Li, Yakun Chen, Xianzhi Wang, Lina Yao, Guandong Xu

DOI: https://doi.org/10.1007/s00521-024-09695-x

2024-04-19

Neural Computing and Applications

Abstract: As a significant application of machine learning in financial scenarios, loan default risk prediction aims to evaluate a client’s default probability. However, most existing deep learning solutions treat each application as an independent individual, neglecting the explicit connections among different application records. In addition, these approaches suffer from missing data and an imbalanced distribution (i.e., default records make up only a small share of all applications). We believe similar records can provide auxiliary signals, which are of critical importance for alleviating the missing-data issue and facilitating data augmentation. To this end, we propose multi-view loan application graphs, dubbed MLAGs. A loan application graph is constructed by evaluating the similarity between records. Furthermore, we apply different similarity thresholds to build multiple graph structures; thus, a variety of representations can be generated via information propagation and aggregation for small-sample augmentation. Consequently, the imbalanced data distribution and missing-value issues can be alleviated effectively. We conduct experiments on three public datasets from real-world home credit and P2P lending platforms, which show that the proposed multi-view GCN (MGCN) outperforms both conventional and deep learning models. Ablation studies also illustrate the validity of each module design.

computer science, artificial intelligence

What problem does this paper attempt to address?

This paper addresses loan default risk prediction. Existing deep learning methods typically treat each loan application as an independent entity, ignoring explicit connections between different application records. These methods also struggle with missing data and imbalanced distributions (i.e., default records are a small minority of samples). The paper argues that similar records can provide auxiliary signals that help alleviate the missing-data problem and support data augmentation. To this end, it introduces Multi-View Loan Application Graphs (MLAGs): a graph is constructed by computing similarities between records, and multiple graphs are built with different similarity thresholds so that propagation and aggregation over them yield diverse representations for small-sample augmentation. According to the paper, this mitigates both the data imbalance and missing-value issues. Experiments on three public datasets from real-world home credit and P2P lending platforms show that MGCN outperforms conventional and deep learning models, and ablation studies validate each module design.
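The multi-view graph idea summarized above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the use of cosine similarity, the threshold values, the single parameter-free propagation step, and the concatenation of views are all assumptions made for illustration, and the function names (`build_views`, `propagate`) are hypothetical. The features are assumed to be numeric and already imputed.

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between rows (application records)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)
    return Xn @ Xn.T

def build_views(X, thresholds):
    """One adjacency matrix per similarity threshold (one 'view' each).

    Assumption: two records are connected when their cosine similarity
    meets the threshold; the abstract does not specify the measure.
    """
    S = cosine_similarity_matrix(X)
    views = []
    for t in thresholds:
        A = (S >= t).astype(float)
        A = np.maximum(A, A.T)      # keep the graph undirected
        np.fill_diagonal(A, 1.0)    # self-loops, as in a standard GCN
        views.append(A)
    return views

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} A D^{-1/2}."""
    deg = A.sum(axis=1)             # >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def propagate(X, views):
    """One propagation step per view, concatenated feature-wise.

    A real GCN layer would also apply learned weights and a
    nonlinearity; they are omitted to keep the sketch short.
    """
    return np.concatenate([normalize_adjacency(A) @ X for A in views], axis=1)

# Toy data: 8 hypothetical application records with 4 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))
views = build_views(X, thresholds=[0.2, 0.6])
H = propagate(X, views)
print(H.shape)
```

A looser threshold gives a denser graph that smooths each record's features over more neighbours, while a stricter one keeps only close matches, so each view contributes a different representation of the same record.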

Attention-based Dynamic Multilayer Graph Neural Networks for Loan Default Prediction

Loan Fraud Users Detection in Online Lending Leveraging Multiple Data Views

Loan Default Analysis with Multiplex Graph Learning

A Deep Learning-based Model for P2P Microloan Default Risk Prediction

Applying Hybrid Graph Neural Networks to Strengthen Credit Risk Analysis

Machine learning application in online lending risk prediction

Application of Machine Learning in Loan Default Prediction

Prediction of loan default based on multi-model fusion

Financial Default Prediction via Motif-preserving Graph Neural Network with Curriculum Learning

Loan Default Prediction in Microfinance Group Lending with Machine Learning

Ensemble-Based Machine Learning Algorithm for Loan Default Risk Prediction

Temporal-Aware Graph Neural Network for Credit Risk Prediction

Loan default prediction of Chinese P2P market: a machine learning methodology

A two-stage hybrid credit risk prediction model based on XGBoost and graph-based deep neural network

Study on a prediction of P2P network loan default based on the machine learning LightGBM and XGboost algorithms according to different high dimensional data cleaning

Prediction Defaults for Networked-guarantee Loans

A deep learning model-based approach to financial risk assessment and prediction

CDGAT: a graph attention network method for credit card defaulters prediction

Prediction and Analysis of Financial Default Loan Behavior Based on Machine Learning Model

Applying Machine Learning Techniques To Maximize The Performance of Loan Default Prediction