Aligning LLMs with Human Instructions and Stock Market Feedback in Financial Sentiment Analysis

Zijie Zhao, Roy E. Welsch
2024-10-19
Abstract: Financial sentiment analysis is crucial for trading and investment decision-making. This study introduces an adaptive retrieval augmented framework for Large Language Models (LLMs) that aligns with human instructions through Instruction Tuning and incorporates market feedback to dynamically adjust weights across various knowledge sources within the Retrieval-Augmented Generation (RAG) module. Building upon foundational models like LLaMA 2, we fine-tune a series of LLMs ranging from 7B to 70B in size, enriched with Instruction Tuning and RAG, and further optimized through direct feedback and Reinforcement Learning (RL)-based refinement methods applied to the source weights of RAG. Through extensive evaluation, we demonstrate that the sentiment outputs from our LLMs more accurately mirror the intrinsic sentiment of textual data, showcasing a 1% to 6% boost in accuracy and F1 score over existing state-of-the-art models and leading conversational AI systems. Moreover, the sentiments extracted are more indicative of the directions in stock price movements. On top of that, we successfully construct portfolios that yield a 3.61% higher Sharpe ratio compared to the S&P 500 baseline in bullish markets. These portfolios also demonstrate resilience in bearish markets, with a 5x reduction in return losses compared to those typically experienced by the S&P 500.
Computational Engineering, Finance, and Science
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to address several key issues in financial sentiment analysis:

1. **Improving the accuracy of large language models (LLMs) in financial sentiment analysis**:
   - Instruction Tuning aligns the LLMs with human instructions, and Retrieval-Augmented Generation (RAG) supplies external context drawn from multiple knowledge sources.
   - Market feedback is used to optimize the weight assigned to each knowledge source in the RAG module, improving the model's accuracy in predicting the sentiment of financial texts.
2. **Addressing the mismatch between model pre-training objectives and the specific needs of financial sentiment analysis**:
   - Enhancing the model's performance in financial sentiment prediction through targeted instruction tuning.
   - Using RAG to compensate for the lack of background information in short content such as news briefs and tweets, so that the model has external context to draw on.
3. **Exploring the impact of increasing model size on the performance of financial sentiment analysis**:
   - Studying LLMs with 7B to 70B parameters on financial sentiment analysis tasks to see how model size affects task performance.
4. **Validating whether aligning LLMs with market feedback can more accurately reflect future stock price movements**:
   - Evaluating the model's accuracy in predicting the direction of stock price movements by aligning sentiment predictions with actual market returns.
   - Constructing investment portfolios to demonstrate the model's performance in bull and bear markets, particularly its resilience in bear markets.

### Main Research Questions (RQs)

- **RQ 1**: How can the RAG module of LLMs be enhanced by adaptively and non-uniformly weighting multiple knowledge sources and updating weights based on actual market feedback?
- **RQ 2**: What is the impact of increasing model size on the performance of LLMs in financial sentiment analysis?
- **RQ 3**: Can aligning LLMs with market feedback produce sentiment predictions that more accurately reflect future stock price movements?

### Method Overview

1. **Instruction Tuning**: Constructing an instruction dataset to fine-tune LLMs, enabling them to better understand and execute complex instructions.
2. **Retrieval-Augmented Generation (RAG)**: Combining external multi-source knowledge bases, using non-uniform weight allocation and a Weighted Overlap Coefficient (WOC) to enhance the model's contextual understanding.
3. **Direct Weight Optimization**: Directly adjusting the weights of each knowledge source in the RAG module based on market feedback.
4. **Reinforcement Learning (RL) Weight Optimization**: Using the Proximal Policy Optimization (PPO) algorithm to dynamically adjust the weights of each knowledge source in the RAG module based on market feedback.

### Experimental Results

- **Performance Improvement**: Compared to existing state-of-the-art models and leading conversational AI systems, the proposed financial LLMs achieve higher accuracy and F1 scores, with improvements ranging from 1% to 6%.
- **Portfolio Performance**: The constructed investment portfolios yield a 3.61% higher Sharpe ratio than the S&P 500 baseline in bull markets and show resilience in bear markets, with a 5-fold reduction in return losses.

### Conclusion

The paper addresses these issues with an adaptive retrieval-augmented LLM framework, reporting gains in sentiment classification accuracy, in how well the extracted sentiment indicates the direction of stock price movements, and in portfolio construction.
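
Steps 2 and 3 of the method overview can be illustrated with a minimal sketch. The summary does not give the paper's definition of WOC or its weight-update rule, so everything here is an assumption made for illustration: relevance is the standard overlap coefficient between query and document tokens, scaled by a per-source weight, and the "direct" update is a simple multiplicative rule driven by a reward in [-1, 1] (for example, +1 when the predicted sentiment matched the sign of the subsequent return). The function names, the learning rate `lr`, and the source labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def overlap_coefficient(a: Set[str], b: Set[str]) -> float:
    """Standard overlap coefficient: |A & B| / min(|A|, |B|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def score_documents(
    query: str,
    docs: List[Tuple[str, str]],
    weights: Dict[str, float],
) -> List[Tuple[str, str, float]]:
    """Rank (source, text) pairs by source weight times token overlap.

    Assumption: this stands in for the paper's Weighted Overlap
    Coefficient, whose exact form is not given in the summary.
    """
    query_tokens = set(query.lower().split())
    scored = [
        (src, text, weights[src] * overlap_coefficient(query_tokens, set(text.lower().split())))
        for src, text in docs
    ]
    # Highest weighted relevance first.
    return sorted(scored, key=lambda item: item[2], reverse=True)


def update_weights(
    weights: Dict[str, float],
    source: str,
    reward: float,
    lr: float = 0.1,
) -> Dict[str, float]:
    """Hypothetical direct update: scale one source's weight by the
    market-feedback reward, then renormalize so the weights sum to 1."""
    if not 0.0 < lr < 1.0:
        raise ValueError("lr must be in (0, 1) so weights stay positive")
    reward = max(-1.0, min(1.0, reward))  # clamp to [-1, 1]
    updated = dict(weights)
    updated[source] *= 1.0 + lr * reward
    total = sum(updated.values())
    return {src: w / total for src, w in updated.items()}


if __name__ == "__main__":
    weights = {"news": 0.6, "tweets": 0.4}
    docs = [
        ("tweets", "apple stock"),
        ("news", "apple earnings beat expectations"),
    ]
    for src, text, score in score_documents("apple earnings beat", docs, weights):
        print(f"{src:7s} {score:.2f}  {text}")
    # Suppose context from "news" led to a prediction the market confirmed.
    print(update_weights(weights, "news", reward=1.0))
```

The renormalization keeps the weights comparable across updates. The PPO-based variant in step 4 would replace the fixed multiplicative rule with a learned policy over the source weights, which is beyond this sketch.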