Enhancing User's Income Estimation with Super-App Alternative Data

Gabriel Suarez, Juan Raful, Maria A. Luque, Carlos F. Valencia, Alejandro Correa-Bahnsen
DOI: https://doi.org/10.48550/arXiv.2104.05831
2021-08-03
Abstract: This paper presents the advantages of alternative data from Super-Apps to enhance users' income estimation models. It compares the performance of these alternative data sources with the performance of industry-accepted bureau income estimators that take into account only financial system information, successfully showing that the alternative data manage to capture information that bureau income estimators do not. By implementing the TreeSHAP method for Stochastic Gradient Boosting Interpretation, this paper highlights which of the customer's behavioral and transactional patterns within a Super-App have stronger predictive power when estimating a user's income. Ultimately, this paper shows the incentive for financial institutions to seek to incorporate alternative data into constructing their risk profiles.
Machine Learning, Risk Management
What problem does this paper attempt to address?
The main problem this paper attempts to solve is how to use alternative data from super-apps to improve the accuracy of user income estimation models. Specifically, the paper aims to answer three research questions:

1. **Can alternative data sources from super-apps significantly improve the statistical performance of income estimation?**
   - The paper explores this by comparing the performance of alternative data with traditional, industry-recognized income estimation methods, which are based only on financial system information.
2. **What user behavior patterns do these super-app features reveal, and how do they differ from traditional financial information sources?**
   - By analyzing users' interaction and transaction behaviors in super-apps, the paper identifies new behavior patterns that may have stronger predictive power for income estimation.
3. **Which behavior patterns show stronger predictive power?**
   - Using the TreeSHAP method to interpret the stochastic gradient boosting model, the paper determines which user behavior patterns have stronger predictive power for income estimation.

### Main Conclusions

The results show that using alternative data from super-apps can significantly improve the accuracy of income estimation compared with traditional methods that rely only on financial system information. Specifically:

- **Alternative data outperforms traditional methods**: The experimental results show that the model combined with super-app alternative data performs better in terms of mean absolute percentage error (MAPE).
- **Alternative data captures information not covered by traditional methods**: Super-app data captures aspects of user behavior such as consumption habits and payment behaviors, which have important predictive value for income estimation.
- **Predictive power of key behavior patterns**: Patterns such as frequently ordering from expensive restaurants, high spending amounts, and using premium credit cards have strong predictive power for income estimation, while users who prefer discounts tend to be predicted as lower-income.

### Significance

This research offers financial institutions a new approach: by integrating alternative data from super-apps, they can assess users' income levels more accurately and thereby manage credit risk better. This helps institutions set loan amounts more accurately and also enables users without traditional financial records to obtain financial services, promoting financial inclusion.
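
As a rough illustration of the MAPE comparison mentioned under Main Conclusions, the following Python sketch computes the metric for two sets of income estimates. All income figures, variable names, and the `mape` helper are hypothetical and chosen only for illustration; they are not taken from the paper and do not reproduce its results.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up monthly incomes and two sets of estimates (illustrative only).
actual_income = [1000.0, 2000.0, 4000.0]
bureau_estimate = [1500.0, 1500.0, 3000.0]    # hypothetical bureau-only model
combined_estimate = [1100.0, 1800.0, 4400.0]  # hypothetical model with app data

bureau_mape = mape(actual_income, bureau_estimate)
combined_mape = mape(actual_income, combined_estimate)

print(f"bureau-only MAPE: {bureau_mape:.1f}%")
print(f"combined MAPE:    {combined_mape:.1f}%")
```

A lower MAPE means the estimates are, on average, closer to the true income in relative terms. Because each error is divided by the true income, the metric is undefined for users whose true income is zero, which the sketch guards against explicitly.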