Trillion Dollar Words: A New Financial Dataset, Task & Market Analysis

Agam Shah, Suvan Paturi, Sudheer Chava
2023-05-14
Abstract: Monetary policy pronouncements by the Federal Open Market Committee (FOMC) are a major driver of financial market returns. We construct the largest tokenized and annotated dataset of FOMC speeches, meeting minutes, and press conference transcripts in order to understand how monetary policy influences financial markets. In this study, we develop a novel task of hawkish-dovish classification and benchmark various pre-trained language models on the proposed dataset. Using the best-performing model (RoBERTa-large), we construct a measure of monetary policy stance for the FOMC document release days. To evaluate the constructed measure, we study its impact on the treasury market, stock market, and macroeconomic indicators. Our dataset, models, and code are publicly available on Huggingface and GitHub under the CC BY-NC 4.0 license.
Computation and Language, Computational Finance
What problem does this paper attempt to address?
The paper introduces a new task, hawkish-dovish classification, to better understand and quantify how the monetary policy stance of the U.S. Federal Open Market Committee (FOMC) affects financial markets. Specifically, the paper aims to:
1. **Construct a large-scale labeled dataset**: The authors build the largest tokenized and annotated text dataset of FOMC speeches, meeting minutes, and press conference transcripts, in order to study how monetary policy affects financial markets.
2. **Develop a new classification task**: The paper proposes hawkish-dovish classification to identify the monetary policy stance expressed in documents released by the FOMC. This differs from traditional positive-negative sentiment analysis, which cannot accurately capture policy stance.
3. **Evaluate the performance of different models**: The paper benchmarks various pre-trained language models (such as RoBERTa-large) on the proposed classification task.
4. **Construct a monetary policy stance indicator**: Using the best-performing model (RoBERTa-large), the authors construct a measure of monetary policy stance for the days on which FOMC documents are released, and study its impact on the treasury market, the stock market, and macroeconomic indicators.
Through these efforts, the paper aims to provide a more accurate way to understand and predict the potential impact of the FOMC's monetary policy stance on financial markets.
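To make the idea of a stance indicator concrete, here is a minimal Python sketch of how sentence-level labels could be aggregated into one document-level score. This summary does not state the exact formula, so the aggregation used here (hawkish count minus dovish count, divided by the total number of sentences) is an assumption, as are the three label names and the hypothetical helper `stance_score`. In practice the labels would come from a fine-tuned classifier such as RoBERTa-large; they are hard-coded here so the example runs on its own.

```python
from collections import Counter
from typing import Iterable

# Assumed label set: the summary names hawkish and dovish; "neutral" is an assumption.
LABELS = ("hawkish", "dovish", "neutral")


def stance_score(labels: Iterable[str]) -> float:
    """Aggregate sentence-level labels into a score in [-1, 1].

    Assumed formula (not taken from this summary):
        (number of hawkish - number of dovish) / total number of sentences
    Positive values lean hawkish, negative values lean dovish.
    """
    counts = Counter(labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no sentences: treat the document as neutral
    return (counts["hawkish"] - counts["dovish"]) / total


# Hypothetical classifier output for the sentences of one FOMC document.
example_labels = ["hawkish", "hawkish", "dovish", "neutral"]
print(stance_score(example_labels))  # 0.25
```

Normalizing by the sentence count keeps scores comparable across documents of different lengths, which matters when speeches, minutes, and press conference transcripts are compared on the same scale.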