Measuring Consistency in Text-based Financial Forecasting Models

Linyi Yang, Yingpeng Ma, Yue Zhang
2023-06-02
Abstract: Financial forecasting has been an important and active area of machine learning research, as even the most modest advantage in predictive accuracy can be parlayed into significant financial gains. Recent advances in natural language processing (NLP) bring the opportunity to leverage textual data, such as earnings reports of publicly traded companies, to predict the return rate for an asset. However, when dealing with such a sensitive task, the consistency of models -- their invariance under meaning-preserving alternations in input -- is a crucial property for building user trust. Despite this, current financial forecasting methods do not consider consistency. To address this problem, we propose FinTrust, an evaluation tool that assesses logical consistency in financial text. Using FinTrust, we show that the consistency of state-of-the-art NLP models for financial forecasting is poor. Our analysis of the performance degradation caused by meaning-preserving alternations suggests that current text-based methods are not suitable for robustly predicting market information. All resources are available at https://github.com/yingpengma/fintrust.
Computation and Language, Artificial Intelligence, General Economics
What problem does this paper attempt to address?
The paper addresses how to evaluate, and ultimately improve, the consistency of text-based financial forecasting models. The authors note that in financial forecasting even a slight predictive advantage can translate into significant economic gains. Current methods, however, overlook model consistency: the property that a model's output stays the same when the input is reworded without changing its meaning, and changes in a logically expected way when the meaning itself is transformed. This property is crucial for building user trust in the model.

To fill this gap, the authors propose FinTrust, an evaluation tool that defines a set of logical consistency tests and applies them to current natural language processing (NLP) models for financial forecasting. The paper defines four types of test: negation consistency, symmetry consistency, addition consistency, and transitivity consistency. Existing text-based financial forecasting models perform poorly under these transformations, which points to problems with their robustness and credibility.

The paper also uses FinTrust to evaluate the implicit biases of pre-trained language models such as BERT and FinBERT, and measures how much the stock trend prediction performance of different models degrades after the consistency transformations are applied. All tested models show significant performance degradation, suggesting that they may rely on specific patterns in the training data rather than on logically consistent reasoning.

In summary, the paper aims to address insufficient model consistency in financial forecasting and proposes FinTrust as an evaluation tool to promote further research in this area.
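To make the idea of a consistency test concrete, the following is a minimal Python sketch of how three of the named tests might be checked against a binary up/down classifier. It is not FinTrust's implementation: the function names, the `toy_model` keyword classifier, and the exact pass criteria are all assumptions made for illustration, and transitivity is left out because this summary does not describe how it is constructed.

```python
from typing import Callable, List

Label = int  # 1 = price expected to rise, 0 = price expected to fall
Model = Callable[[str], Label]


def toy_model(text: str) -> Label:
    """Hypothetical stand-in for a forecasting model: counts cue words."""
    positive = {"increase", "growth", "beat"}
    negative = {"decrease", "decline", "miss"}
    words = text.lower().replace(".", " ").split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return 1 if score > 0 else 0


def negation_consistent(model: Model, text: str, negated_text: str) -> bool:
    """Assumed criterion: negating the statement should flip the prediction."""
    return model(text) != model(negated_text)


def symmetry_consistent(model: Model, first: str, second: str) -> bool:
    """Assumed criterion: swapping the order of two passages keeps the prediction."""
    return model(first + " " + second) == model(second + " " + first)


def additive_consistent(model: Model, first: str, second: str) -> bool:
    """Assumed criterion: if both passages get the same label alone,
    their concatenation should get that label too."""
    a, b = model(first), model(second)
    if a != b:
        return True  # premise not met, so nothing to violate
    return model(first + " " + second) == a


def consistency_rate(results: List[bool]) -> float:
    """Fraction of test cases on which the model behaved consistently."""
    return sum(results) / len(results) if results else 0.0


if __name__ == "__main__":
    results = [
        # Antonym swap: the keyword model flips its prediction.
        negation_consistent(toy_model, "We expect revenue to increase.",
                            "We expect revenue to decrease."),
        # Explicit "not": the keyword model ignores it and stays bullish.
        negation_consistent(toy_model, "Revenue will increase.",
                            "Revenue will not increase."),
        symmetry_consistent(toy_model, "Sales growth continued.",
                            "Margins beat guidance."),
        additive_consistent(toy_model, "Sales growth continued.",
                            "Margins beat guidance."),
    ]
    print(results, consistency_rate(results))
```

The second negation case shows the kind of failure the paper is concerned with: a model that keys on surface patterns can pass an antonym swap yet miss an explicit "not", so the same prediction comes back for two statements with opposite meanings.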