Textual Data Mining for Financial Fraud Detection: A Deep Learning Approach

Qiuru Li
2023-08-05
Abstract: In this report, I present a deep learning approach to a natural language processing (hereafter NLP) binary classification task for analyzing financial-fraud texts. First, I searched regulatory announcements and enforcement bulletins on HKEX news to identify fraudulent companies and extracted their Management Discussion and Analysis (MD&A) reports; I then organized the sentences from those reports with labels and reporting times. My methodology applies several neural network models to the text classification task: Multilayer Perceptrons with Embedding layers, the vanilla Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and the Gated Recurrent Unit (GRU). Using this diverse set of models, I compare their accuracy in detecting financial fraud. This work contributes to the growing body of research at the intersection of deep learning, NLP, and finance, and offers insights for industry practitioners, regulators, and researchers pursuing more robust and effective fraud detection methods.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses the detection of financial fraud using Natural Language Processing (NLP) techniques and deep learning. Specifically, the author used the regulatory announcements and enforcement bulletins of the Hong Kong Stock Exchange (HKEX) to identify companies involved in fraud, extracted sentences from the Management Discussion and Analysis (MD&A) reports of these companies, and labeled each sentence along with its reporting time. Several neural network models (a Multilayer Perceptron with Embedding layers, a vanilla RNN, an LSTM, and a GRU) were then trained on a binary text classification task that separates fraudulent from non-fraudulent texts. By comparing the accuracy of these models, the author aims to identify a more effective method for detecting financial fraud and to provide useful insights for regulators, industry practitioners, and researchers.
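The data preparation step described above (sentences from MD&A reports, each carrying a label and a reporting time) can be illustrated with a minimal sketch. This is not the author's code: the `LabeledSentence` record, the `split_sentences` and `label_report` helpers, the regex-based sentence splitter, and the toy input are all hypothetical and only assume that every sentence inherits the company-level fraud label and the date of the report it came from.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class LabeledSentence:
    """Hypothetical record: one MD&A sentence with its label and report date."""
    text: str
    label: int          # 1 = company flagged as fraudulent, 0 = otherwise
    report_date: date   # reporting time of the source MD&A report


def split_sentences(report_text: str) -> List[str]:
    """Naive splitter (an assumption): break on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", report_text.strip())
    return [p for p in parts if p]


def label_report(report_text: str, is_fraud: bool, report_date: date) -> List[LabeledSentence]:
    """Attach the company-level label and the report date to every sentence."""
    label = 1 if is_fraud else 0
    return [LabeledSentence(s, label, report_date) for s in split_sentences(report_text)]


# Toy input for illustration only; not data from the paper.
rows = label_report(
    "Revenue grew strongly. Management expects further growth!",
    is_fraud=True,
    report_date=date(2020, 3, 31),
)
for row in rows:
    print(row.label, row.report_date.isoformat(), row.text)
```

Records of this shape could then be tokenized and passed to any of the classifiers compared in the paper. The naive splitter will mis-split on abbreviations such as "Ltd." followed by a space, so a real pipeline would likely need a more careful sentence tokenizer.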