FETILDA: An Evaluation Framework for Effective Representations of Long Financial Documents

Bolun (Namir) Xia, Vipula Rawte, Aparna Gupta, Mohammed Zaki
DOI: https://doi.org/10.1145/3657299
IF: 4.157
2024-04-10
ACM Transactions on Knowledge Discovery from Data
Abstract: In the financial sphere, there is a wealth of accumulated unstructured financial data, such as the textual disclosure documents that companies submit on a regular basis to regulatory agencies, such as the Securities and Exchange Commission (SEC). These documents are typically very long and tend to contain valuable soft information about a company’s performance that is not present in quantitative predictors. It is therefore of great interest to learn predictive models from these long textual documents, especially for forecasting numerical key performance indicators (KPIs). In recent years, there has been great progress in natural language processing via pre-trained language models (LMs) learned from large corpora of textual data. This prompts the important question of whether they can be used effectively to produce representations for long documents, as well as how we can evaluate the effectiveness of representations produced by various LMs. Our work focuses on answering this critical question, namely the evaluation of the efficacy of various LMs in extracting useful soft information from long textual documents for prediction tasks. In this paper, we propose and implement a deep learning evaluation framework that utilizes a sequential chunking approach combined with an attention mechanism. We perform an extensive set of experiments on a collection of 10-K reports submitted annually by US banks, and another dataset of reports submitted by US companies, in order to investigate thoroughly the performance of different types of language models. Overall, our framework using LMs outperforms strong baseline methods for textual modeling as well as for numerical regression. Our work provides better insights into how utilizing pre-trained domain-specific and fine-tuned long-input LMs for representing long documents can improve the quality of representation of textual data, and therefore, help in improving predictive analyses.
computer science, information systems, software engineering
What problem does this paper attempt to address?
The core problem this paper addresses is how to evaluate and use pre-trained language models (LMs) to extract useful information from long financial documents, so as to improve performance on prediction tasks. Specifically:

1. **Challenges of Long-Text Representation**: The financial field holds a large amount of unstructured text data, such as the disclosure documents that companies regularly submit to regulatory agencies (e.g., the U.S. Securities and Exchange Commission, SEC). These documents are usually very long and contain valuable soft information about company performance that is not reflected in quantitative predictors. It is therefore valuable to learn prediction models from these long texts, especially for predicting numerical key performance indicators (KPIs).

2. **Limitations of Existing Methods**:
   - Traditional methods such as TF-IDF can represent documents as numerical feature vectors, but they cannot directly extract the latent semantic information in the text.
   - Existing word-embedding methods (such as word2vec and GloVe) capture lexical and semantic information to a certain extent, but they learn a single static representation for each word and so cannot distinguish the different senses of a polysemous word.
   - State-of-the-art pre-trained language models (such as GPT and BERT) learn context-dependent word embeddings, but they face methodological and ontological challenges when dealing with long documents. For example, BERT limits input sequences to a maximum of 512 tokens, while some parts of financial reports (such as 10-K reports) may exceed 12,000 words.

3. **Research Objectives**: The paper proposes and implements a deep-learning evaluation framework (FETILDA) that combines chunking with an attention mechanism in order to evaluate how well different language models produce representations of long texts, thereby improving predictive analysis in the financial field. Specifically, the objectives of the FETILDA framework are:
   - To address the challenges of long-document representation and process long texts without significant information loss.
   - To provide better insights into how pre-trained domain-specific and fine-tuned long-input language models can improve the representation quality of text data and thus the effectiveness of predictive analysis.

4. **Experimental Verification**: The authors conducted extensive experiments on two different 10-K report datasets, namely the FIN10K dataset of U.S. companies and a 10-K report dataset of U.S. banks, covering multiple regression tasks (such as stock-volatility prediction and key-performance-indicator prediction). The experimental results show that the FETILDA framework significantly outperforms multiple baseline methods on long-financial-text regression tasks and reaches state-of-the-art (SOTA) performance.

In summary, the main contribution of this paper is a novel evaluation framework that addresses the challenges of long-financial-document representation and supports predictive analysis in the financial field.
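
The combination of chunking and attention described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes non-overlapping chunks of 512 tokens, uses a hypothetical `toy_embed` function in place of a real language-model encoder, and uses a fixed `query` vector where a trained model would learn its attention parameters. The function names `chunk_tokens` and `attention_pool` are also illustrative. The 12,000-item placeholder document only mirrors the scale mentioned in the summary; words and subword tokens are not the same unit.

```python
import math
from typing import List, Sequence, Tuple


def chunk_tokens(tokens: Sequence[str], chunk_size: int = 512) -> List[List[str]]:
    """Split a token sequence into consecutive, non-overlapping chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(tokens[i:i + chunk_size]) for i in range(0, len(tokens), chunk_size)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    chunk_vectors: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Score each chunk by its dot product with `query`, then return the
    softmax-weighted sum of chunk vectors together with the weights."""
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in chunk_vectors]
    weights = softmax(scores)
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, chunk_vectors))
        for d in range(len(query))
    ]
    return pooled, weights


def toy_embed(chunk: Sequence[str], dim: int = 4) -> List[float]:
    """Hypothetical stand-in for an LM encoder (assumption, not from the paper):
    a normalized histogram of token lengths modulo `dim`."""
    vec = [0.0] * dim
    for tok in chunk:
        vec[len(tok) % dim] += 1.0
    return [x / len(chunk) for x in vec]


# Placeholder "document" of 12,000 identical tokens, for illustration only.
tokens = ["risk"] * 12000
chunks = chunk_tokens(tokens, chunk_size=512)
vectors = [toy_embed(c) for c in chunks]
doc_vector, weights = attention_pool(vectors, query=[0.1, 0.2, 0.3, 0.4])
```

In a realistic setting, `toy_embed` would be replaced by a pre-trained encoder applied to each chunk, and the resulting `doc_vector` would feed a regression head that predicts the KPI; those components are left out here because the summary does not specify them.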