A Machine Learning-based System for Financial Fraud Detection

João Paulo A. Andrade, Leonardo S. Paulucio, Thiago M. Paixão, Rodrigo F. Berriel, Teresa Cristina Janes Carneiro, Raphael V. Carneiro, Alberto F. De Souza, Claudine Badue, Thiago Oliveira-Santos
DOI: https://doi.org/10.5753/eniac.2021.18250
2021-11-29
Abstract: Companies created for money laundering or as a means of tax evasion are harmful to the country's economy and society. Governmental agencies usually tackle this problem by having officials pore over companies' financial data and single out those that exhibit fraudulent behavior. Such work tends to be slow-paced and tedious. This paper proposes a machine learning-based system that classifies whether a company is likely to be involved in fraud. Based on financial and tax data from various companies, four different classifiers – k-Nearest Neighbors, Random Forest, Support Vector Machine (SVM), and a Neural Network – were trained and then used to indicate fraud. The best-performing model, the Random Forest, achieved a macro-averaged F1-score of 92.98%.
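The macro-averaged F1-score reported in the abstract is the unweighted mean of the per-class F1-scores, so the fraud class and the non-fraud class count equally even when one is much rarer than the other. The following is a minimal sketch of that computation in plain Python. The function name `macro_f1` and the toy labels are hypothetical and chosen only for illustration; they are not the paper's data or code.

```python
def macro_f1(y_true, y_pred):
    """Return the unweighted mean of the per-class F1-scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels (assumption): 1 = fraud, 0 = legitimate
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

score = macro_f1(y_true, y_pred)
print(f"macro-averaged F1: {score:.4f}")
```

In this toy case the per-class F1-scores are 10/12 for the legitimate class and 6/8 for the fraud class, and the macro average is their plain mean. scikit-learn exposes the same metric as `f1_score(y_true, y_pred, average="macro")`.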