Natural Language Processing and Multimodal Stock Price Prediction

Kevin Taylor, Jerry Ng
2024-01-03
Abstract: In the realm of financial decision-making, predicting stock prices is pivotal. Artificial intelligence techniques such as long short-term memory networks (LSTMs), support-vector machines (SVMs), and natural language processing (NLP) models are commonly employed to predict said prices. This paper utilizes stock percentage change as training data, in contrast to the traditional use of raw currency values, with a focus on analyzing publicly released news articles. The choice of percentage change aims to provide models with context regarding the significance of price fluctuations and overall price change impact on a given stock. The study employs specialized BERT natural language processing models to predict stock price trends, with a particular emphasis on various data modalities. The results showcase the capabilities of such strategies with a small natural language processing model to accurately predict overall stock trends, and highlight the effectiveness of certain data features and sector-specific data.
Machine Learning, Computation and Language
What problem does this paper attempt to address?
This paper addresses how to predict stock prices more accurately using natural language processing (NLP) and multimodal data. Existing prediction methods commonly rely on long short-term memory networks (LSTMs), support-vector machines (SVMs), and NLP models trained on raw currency values. The paper instead uses the percentage change in stock price as training data. The authors argue that percentage change gives the model context about how significant a price movement is, so it can better gauge the impact of that movement on a given stock.

The researchers use pre-trained BERT models to analyze publicly available news articles and predict stock trends from different data modalities. They find that a simplified version of the data, containing only key information such as company name, article title, and percentage change, lets the model predict overall stock trends more effectively. Compared with traditional methods such as LSTMs, this sentiment-analysis-based approach performs better in certain cases, especially for long-term trends. The experimental results indicate that the model may be less accurate for individual stocks or short-term predictions, but it adapts over time and predicts long-term market trends.

The paper also examines how data features such as article source and date affect model performance, and finds that they may not be significant factors in prediction accuracy. In summary, the paper explores strategies for stock price prediction that combine NLP with percentage-change data, with the goal of improving both the accuracy of prediction models and the understanding of market dynamics.
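To make the percentage-change idea concrete, the following is a minimal sketch, not the authors' implementation. It computes the percentage change between two prices and assembles a simplified record from the three fields the summary mentions (company name, article title, percentage change). The function names, the `|` separator, the example company, and the prices are all assumptions chosen for illustration; the paper's actual input format is not specified here.

```python
def percentage_change(previous_price: float, current_price: float) -> float:
    """Return the price change as a percentage of the previous price."""
    if previous_price == 0:
        raise ValueError("previous_price must be non-zero")
    return (current_price - previous_price) * 100.0 / previous_price


def build_record(company: str, title: str, pct_change: float) -> str:
    """Join the three fields into one text record.

    The ' | ' separator and the signed two-decimal format are
    hypothetical; they only illustrate a simplified data version.
    """
    return f"{company} | {title} | {pct_change:+.2f}%"


# The same 5-unit move means very different things for two stocks:
small_cap_move = percentage_change(50.0, 55.0)    # 10% of the price
large_cap_move = percentage_change(500.0, 505.0)  # 1% of the price

# Hypothetical company and headline, used only as placeholders.
record = build_record("Acme Corp", "Acme beats earnings", small_cap_move)
print(record)
```

The two example moves have the same raw currency value, which is why a model trained on raw prices cannot tell them apart without extra context, while the percentage form separates them directly.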