Analyzing Financial Fraud Cases Using a Linguistics-Based Text Mining Approach

Mohamed Zaki, Babis Theodoulidis
DOI: https://doi.org/10.2139/ssrn.2353834
2013-01-01
SSRN Electronic Journal
Abstract: This paper proposes a linguistics-based text mining approach that automates the extraction of financial concepts from Securities and Exchange Commission (SEC) litigation releases. The text mining application includes data source analyzers that extract concepts such as 'litigation release number', 'release publication dates', 'actions', 'document format type', 'document link', 'short description', 'detailed description', and 'plaintiff'. It also includes analyzers that extract manipulation participants, type of manipulation, laws and regulations, and the timeline of manipulation events and actions from the SEC complaint filed in each fraud case. To extract these concepts, the paper uses the financial fraud ontology introduced in [17] as the underlying framework for the extraction process and for capturing financial fraud concepts from the SEC litigation releases. Domain specificity is incorporated by developing linguistic resources for the financial fraud domain, which makes the analysis more effective. The approach presents the extracted information as a knowledge base that helps users acquire, maintain, and access financial fraud knowledge and that improves search results in the SEC enforcement portal. Such a knowledge base can help answer questions such as: When was this manipulative action performed? Where is the manipulator getting his profit from? Finally, an important implication of the approach is that it addresses the need to reuse the proposed text mining approach in other parts of the domain and to integrate the extracted information with other financial systems, such as market monitoring and surveillance systems, crowd monitoring, and fraud knowledge management systems.
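To make the idea of a data source analyzer more concrete, the following is a minimal sketch of how a few of the listed concepts (release number, publication date, plaintiff, laws and regulations) might be pulled from release text. It is not the authors' implementation: it swaps in plain regular expressions for the paper's ontology-driven linguistic resources, and the sample text, the `ReleaseConcepts` class, the `extract` function, and all patterns are assumptions invented for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical sample text written for this sketch; not a real SEC release.
SAMPLE = (
    "Litigation Release No. 12345 / January 2, 2013\n"
    "Securities and Exchange Commission v. Example Trader, "
    "Civil Action No. 13-cv-0001\n"
    "The complaint alleges a violation of Section 10(b) of the "
    "Securities Exchange Act of 1934 and Rule 10b-5 thereunder."
)


@dataclass
class ReleaseConcepts:
    """Illustrative container for a handful of extracted concepts."""
    release_number: Optional[str]
    publication_date: Optional[str]
    plaintiff: Optional[str]
    laws: List[str]


# Assumed surface patterns; real releases may be formatted differently.
RELEASE_NO = re.compile(r"Litigation Release No\.\s*(\d+)")
DATE = re.compile(
    r"(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\s+\d{1,2},\s+\d{4}"
)
# Party named before "v." at the start of a line.
PLAINTIFF = re.compile(r"^(.+?)\s+v\.\s+", re.MULTILINE)
# Statute sections such as "Section 10(b)" and rules such as "Rule 10b-5".
LAW = re.compile(r"Section\s+\d+\([a-z]\)|Rule\s+\d+[a-z]-\d+")


def extract(text: str) -> ReleaseConcepts:
    """Apply each pattern and collect whatever matches were found."""
    number = RELEASE_NO.search(text)
    date = DATE.search(text)
    plaintiff = PLAINTIFF.search(text)
    return ReleaseConcepts(
        release_number=number.group(1) if number else None,
        publication_date=date.group(0) if date else None,
        plaintiff=plaintiff.group(1) if plaintiff else None,
        laws=LAW.findall(text),
    )


if __name__ == "__main__":
    print(extract(SAMPLE))
```

Fixed patterns like these only cover surface forms that were anticipated in advance. Concepts such as manipulation type or the timeline of events depend on domain vocabulary and context, which is where the domain-specific linguistic resources described in the abstract would come in.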