Intelligent Prediction Mathematical Model of Industrial Financial Fraud Based on Data Mining

Xiuqin Geng, Dawei Yang
DOI: https://doi.org/10.1155/2021/8520094
IF: 1.43
2021-08-03
Mathematical Problems in Engineering
Abstract: The essence of enterprise financial modeling is to use mathematical models to classify and sort out all kinds of enterprise information according to the main line of value creation and on this basis to complete the analysis, prediction, and value evaluation of enterprise financial situation. A reasonable financial model is also an effective means to reduce financial fraud. In this paper, a financial fraud identification model is constructed based on empirical data. In the process of model construction, the primary feature set is selected according to the financial fraud motivation theory, and then, the original feature set is obtained by Mann–Whitney test on the primary feature set, and the final fraud identification feature set is selected from the original feature set by using Relief and Boruta algorithms. Finally, based on the final fraud identification feature set, the data algorithms such as decision tree, logistic regression, support vector machine, and random forest are used to identify financial fraud. The experimental results show that the combination of financial fraud identification features constructed by the Relief algorithm and random forest model has the best recognition effect. The evaluation indexes of the G mean value and the F value were 75.86% and 78.33%, respectively.
engineering, multidisciplinary; mathematics, interdisciplinary applications
### What problem does this paper attempt to solve?

This paper addresses the identification of corporate financial fraud. The authors construct an intelligent prediction model of industrial financial fraud based on data mining techniques. The model applies several machine learning algorithms (decision tree, logistic regression, support vector machine, and random forest) to identify likely financial fraud from a large volume of financial data.

### Main contributions of the paper

1. **Feature selection**:
   - The authors select a primary feature set according to financial fraud motivation theory, then screen it with the Mann–Whitney test to obtain the original feature set.
   - They then use the Relief and Boruta algorithms to select the final fraud identification feature set from the original feature set.
2. **Model construction and evaluation**:
   - Decision tree, logistic regression, support vector machine, and random forest algorithms are used to construct financial fraud identification models.
   - The models are compared through cross-validation and multiple evaluation metrics (such as the G-mean and the F-value).
3. **Experimental results**:
   - The feature set selected by the Relief algorithm, combined with the random forest model, gives the best identification performance, with a G-mean of 75.86% and an F-value of 78.33%.

### Background and significance of the paper

As enterprises grow and financial management becomes more complex, traditional manual financial management can no longer meet demand. Financial fraud not only disrupts the normal operation of enterprises but can also cause serious economic losses, so improving auditors' ability to identify it is of great significance. By combining data mining techniques with mathematical modeling, this paper provides an effective means of identifying financial fraud and thereby reducing the losses it causes.
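
To make the Relief-based feature selection step more concrete, the following is a minimal sketch of the basic two-class Relief weighting scheme for numeric features. It is not the authors' implementation: the function name `relief_weights`, the range-normalized Manhattan distance, the choice to visit every sample rather than a random subset, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence


def relief_weights(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Basic Relief feature weights for two classes (illustrative sketch)."""
    n, d = len(X), len(X[0])

    # Per-feature value range, used to scale differences into [0, 1].
    # A constant feature gets range 1.0 to avoid division by zero.
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(f: int, a: Sequence[float], b: Sequence[float]) -> float:
        return abs(a[f] - b[f]) / spans[f]

    def distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum(diff(f, a, b) for f in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue  # no same-class or other-class neighbour available
        near_hit = min(hits, key=lambda j: distance(X[i], X[j]))
        near_miss = min(misses, key=lambda j: distance(X[i], X[j]))
        for f in range(d):
            # Reward features that differ on the nearest miss,
            # penalize features that differ on the nearest hit.
            weights[f] += (diff(f, X[i], X[near_miss]) - diff(f, X[i], X[near_hit])) / n
    return weights


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X_toy = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.5], [0.8, 0.4], [0.9, 0.8], [1.0, 0.1]]
y_toy = [0, 0, 0, 1, 1, 1]

toy_weights = relief_weights(X_toy, y_toy)
ranked = sorted(range(len(toy_weights)), key=lambda f: toy_weights[f], reverse=True)
print(ranked)
```

Features with higher weights would be kept for the final fraud identification feature set. How many features to keep, or what weight threshold to use, is not stated in this summary and would have to be chosen separately.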
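
The two reported evaluation metrics can be illustrated with a short sketch. It assumes the definitions commonly used for imbalanced classification, which this summary does not spell out: the G-mean as the geometric mean of sensitivity and specificity, and the F-value as the harmonic mean of precision and recall. The function names and the example labels are hypothetical, with 1 standing for a fraudulent sample.

```python
import math
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) with label 1 as the positive (fraud) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def g_mean(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Geometric mean of sensitivity (recall on fraud) and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def f_value(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the fraud class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 4 fraudulent and 6 non-fraudulent samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

print(round(g_mean(y_true, y_pred), 4), round(f_value(y_true, y_pred), 4))
```

Under these definitions, the G-mean drops toward zero whenever a model ignores either class, which is why it is often preferred over plain accuracy when fraudulent samples are a minority.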