Predicting Financial Literacy via Semi-supervised Learning

David Hason Rudd, Huan Huo, Guandong Xu

DOI: https://doi.org/10.1007/978-3-030-97546-3_25

2023-12-18

Abstract: Financial literacy (FL) represents a person's ability to turn assets into income, and understanding digital currencies has been added to the modern definition. FL can be predicted by exploiting unlabelled recorded data in financial networks via semi-supervised learning (SSL). Measuring and predicting FL has not been widely studied, resulting in limited understanding of customer financial engagement consequences. Previous studies have shown that low FL increases the risk of social harm. Therefore, it is important to accurately estimate FL to allocate specific intervention programs to less financially literate groups. This will not only increase company profitability, but will also reduce government spending. Some studies considered predicting FL in classification tasks, whereas others developed FL definitions and impacts. The current paper investigated mechanisms to learn customer FL level from their financial data using sampling by the synthetic minority over-sampling technique for regression with Gaussian noise (SMOGN). We propose the SMOGN-COREG model for semi-supervised regression, applying SMOGN to deal with unbalanced datasets and a nonparametric multi-learner co-regression (COREG) algorithm for labeling. We compared the SMOGN-COREG model with six well-known regressors on five datasets to evaluate the proposed model's effectiveness on unbalanced and unlabelled financial data. Experimental results confirmed that the proposed method outperformed the comparator models for unbalanced and unlabelled financial data. Therefore, SMOGN-COREG is a step towards using unlabelled data to estimate FL level.

Machine Learning; Computational Engineering, Finance, and Science; Computers and Society; Econometrics

What problem does this paper attempt to address?

The paper addresses the problem of predicting customers' financial literacy (FL) with semi-supervised learning (SSL). Specifically, it focuses on the following points:

1. **Importance of Financial Literacy**: Financial literacy refers to an individual's ability to convert assets into income, and modern definitions also include an understanding of digital currencies. Low financial literacy increases the risk of social harm, so accurate estimates are needed to allocate specific intervention programs to less financially literate groups.
2. **Limitations of Existing Research**: Measuring and predicting financial literacy has not been widely studied, which limits the understanding of the consequences of customers' financial engagement. Most studies rely on questionnaires to assess financial literacy levels, a method that is time-consuming and costly.
3. **Advantages of Semi-supervised Learning**: Learning from a small amount of labeled data together with a large amount of unlabeled data reduces the cost and time of manual labeling and can improve model performance.
4. **Handling Imbalanced Datasets**: Financial network data is usually severely imbalanced, meaning that samples for certain levels of financial literacy are scarce. The paper proposes a semi-supervised regression model (SMOGN-COREG) that combines the synthetic minority over-sampling technique for regression with Gaussian noise (SMOGN), which rebalances the data, with a nonparametric multi-learner co-regression algorithm (COREG), which labels the unlabeled samples.
5. **Experimental Validation**: The paper evaluates SMOGN-COREG on five datasets to test its effectiveness on imbalanced and unlabeled financial data. The results show that the model outperforms six well-known regressors used as comparators.

In summary, the paper aims to predict customers' financial literacy levels through semi-supervised learning, in particular the SMOGN-COREG model, so that financial institutions and governments can target their interventions more precisely.
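The two-stage idea summarized above (rebalance the labeled data, then let two regressors pseudo-label unlabeled samples for each other) can be illustrated with a small sketch. This is not the authors' implementation: the function names (`oversample_rare`, `coreg`, `predict`), the toy data, and all parameter values are hypothetical. The rebalancing step is a deliberately simplified stand-in for SMOGN that only adds Gaussian noise to rare high-target samples, and the co-regression step assumes the commonly described COREG setup of two k-nearest-neighbour regressors that differ only in their Minkowski distance order.

```python
import random

def minkowski(a, b, p):
    """Minkowski distance of order p between two feature vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def knn_predict(train, x, k, p):
    """Mean target of the k labelled points nearest to x."""
    nearest = sorted(train, key=lambda item: minkowski(item[0], x, p))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def oversample_rare(labelled, threshold, copies, sigma, rng):
    """Simplified stand-in for SMOGN (assumption): copy each rare sample
    (target >= threshold) `copies` times with Gaussian noise added."""
    synthetic = []
    for x, y in labelled:
        if y >= threshold:
            for _ in range(copies):
                noisy_x = [v + rng.gauss(0, sigma) for v in x]
                synthetic.append((noisy_x, y + rng.gauss(0, sigma)))
    return labelled + synthetic

def coreg(labelled, unlabelled, k=3, rounds=10):
    """Co-training regression sketch: each kNN learner pseudo-labels the
    unlabelled point that most reduces its squared error on that point's
    labelled neighbours, and hands it to the other learner."""
    sets = {2: list(labelled), 5: list(labelled)}  # keyed by distance order
    pool = [list(x) for x in unlabelled]
    for _ in range(rounds):
        added = False
        for p, other in ((2, 5), (5, 2)):
            best_gain, best_item = 0.0, None
            for x in pool:
                y_hat = knn_predict(sets[p], x, k, p)
                augmented = sets[p] + [(x, y_hat)]
                neighbours = sorted(
                    sets[p], key=lambda it: minkowski(it[0], x, p)
                )[:k]
                # Error reduction on the neighbours if (x, y_hat) were added.
                gain = sum(
                    (yn - knn_predict(sets[p], xn, k, p)) ** 2
                    - (yn - knn_predict(augmented, xn, k, p)) ** 2
                    for xn, yn in neighbours
                )
                if gain > best_gain:
                    best_gain, best_item = gain, (x, y_hat)
            if best_item is not None:
                sets[other].append(best_item)
                pool.remove(best_item[0])
                added = True
        if not added:  # no pseudo-label helped either learner
            break
    return sets

def predict(sets, x, k=3):
    """Final estimate: average of the two learners' predictions."""
    return (knn_predict(sets[2], x, k, 2) + knn_predict(sets[5], x, k, 5)) / 2

# Toy data (hypothetical): one feature, few samples with a high target.
rng = random.Random(0)
labelled = [([0.1], 0.2), ([0.2], 0.4), ([0.3], 0.6), ([0.4], 0.8), ([0.9], 1.8)]
unlabelled = [[0.15], [0.35], [0.5], [0.7], [0.85]]
balanced = oversample_rare(labelled, threshold=1.5, copies=2, sigma=0.01, rng=rng)
sets = coreg(balanced, unlabelled, k=3, rounds=5)
estimate = predict(sets, [0.8], k=3)
```

The real SMOGN procedure is more involved than this stand-in (it also interpolates between rare samples and under-samples common ones), so the sketch only conveys the order of operations: rebalance first, then co-train on the rebalanced labeled set plus the unlabeled pool.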

Assessing Sensitivity of Machine Learning Predictions. A Novel Toolbox with an Application to Financial Literacy

Exploring the Readiness of Prominent Small Language Models for the Democratization of Financial Literacy

Churn Prediction via Multimodal Fusion Learning: Integrating Customer Financial Literacy, Voice, and Behavioral Data

SNFinLLM: Systematic and Nuanced Financial Domain Adaptation of Chinese Large Language Models

Estimating Financial Fraud through Transaction-Level Features and Machine Learning

A Privacy-Preserving Hybrid Federated Learning Framework for Financial Crime Detection

A Semi-supervised Graph Attentive Network for Financial Fraud Detection

Optimization of Personal Credit Evaluation Based on a Federated Deep Learning Model

Financial risk assessment to improve the accuracy of financial prediction in the internet financial industry using data analytics models

Leveraging Financial Social Media Data for Corporate Fraud Detection

Starlit: Privacy-Preserving Federated Learning to Enhance Financial Fraud Detection

Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs

A federated machine learning approach for order-level risk prediction in Supply Chain Financing

A Semi-Supervised Learning Financial News Classification Algorithm

A semi-supervised Anti-Fraud model based on integrated XGBoost and BiGRU with self-attention network: an application to internet loan fraud detection

Semi-Supervised Learning Classification Based on Generalized Additive Logistic Regression for Corporate Credit Anomaly Detection

Predicting Credit Risk for Unsecured Lending: A Machine Learning Approach

2SFGL: A Simple And Robust Protocol For Graph-Based Fraud Detection

Exploring the impact of financial literacy on predicting credit default among farmers: An analysis using a hybrid machine learning model