The Bank Conducts Credit Evaluation on Credit Holders

Ye Zeng
DOI: https://doi.org/10.54097/bjabrh26
2024-08-15
Abstract: The study began with data preprocessing, in which 88 potential outliers in contact duration were identified and removed using the 3σ rule. Descriptive statistics showed a customer base primarily aged 30-50, with a low median deposit balance (440), brief calls (about 4 minutes), and infrequent interactions. Bar and pie charts highlighted key demographics: managerial and blue-collar workers dominated, most customers were married, the default rate was low, and approximately 15.3% had active loans. Most customers (89.4%) did not use deposit products. Scatter plots revealed significant correlations among continuous variables, with 'duration' showing a moderate positive association with the dependent variable. Multi-level categorical variables were examined with analysis-of-variance tests and multiple comparisons, which revealed differences in deposit subscriptions across occupation and education levels. Binary categorical variables were assessed using t-tests. Logistic regression models were trained on five randomly divided subsets, with SMOTE applied to address class imbalance. The model performed well (accuracy 0.8935, F1 score 0.7107), although its lower recall suggests room for improvement. The study offers insights into customer behavior and proposes avenues for refining credit risk assessment in banking.
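As an illustration of the preprocessing step, the 3σ rule can be sketched as follows. This is a minimal example, not the author's code: the function name `remove_3sigma_outliers` and the sample durations are hypothetical, and the choice of the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, pstdev


def remove_3sigma_outliers(values):
    """Split values into (kept, removed) using the 3-sigma rule.

    A value is treated as a potential outlier when it lies more than
    three standard deviations from the sample mean.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    lower, upper = mu - 3 * sigma, mu + 3 * sigma
    kept = [v for v in values if lower <= v <= upper]
    removed = [v for v in values if v < lower or v > upper]
    return kept, removed


# Hypothetical contact durations in seconds (not data from the study):
# ninety ordinary calls plus one extremely long call.
durations = [180] * 30 + [240] * 30 + [300] * 30 + [5000]

kept, removed = remove_3sigma_outliers(durations)
print(len(kept), removed)
```

Note that the mean and standard deviation are computed from the data that still contains the outliers, so the rule only behaves sensibly on reasonably large samples; on a very small sample a single extreme value can inflate σ enough to escape detection.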