Taureau: A Stock Market Movement Inference Framework Based on Twitter Sentiment Analysis

Nicholas Milikich, Joshua Johnson
2023-03-31
Abstract: With the advent of fast-paced information dissemination and retrieval, it has become inherently important to resort to automated means of predicting stock market prices. In this paper, we propose Taureau, a framework that leverages Twitter sentiment analysis for predicting stock market movement. The aim of our research is to determine whether Twitter, which is assumed to be representative of the general public, can give insight into the public perception of a particular company and has any correlation to that company's stock price movement. We intend to utilize this correlation to predict stock price movement. We first utilize Tweepy and getOldTweets to obtain historical tweets indicating public opinions for a set of top companies during periods of major events. We filter and label the tweets using standard programming libraries. We then vectorize and generate word embedding from the obtained tweets. Afterward, we leverage TextBlob, a state-of-the-art sentiment analytics engine, to assess and quantify the users' moods based on the tweets. Next, we correlate the temporal dimensions of the obtained sentiment scores with monthly stock price movement data. Finally, we design and evaluate a predictive model to forecast stock price movement from lagged sentiment scores. We evaluate our framework using actual stock price movement data to assess its ability to predict movement direction.
Computers and Society, Social and Information Networks, Computational Finance
### What problem does this paper attempt to solve?

This paper attempts to predict stock market fluctuations by analyzing public sentiment on Twitter. Specifically, the authors propose a framework named **Taureau** that aims to use Twitter sentiment analysis to predict the direction of stock price movements. The main research questions of the paper are:

1. **Can Twitter data reflect public opinion about specific companies?**
   - The authors hypothesize that public sentiment on Twitter can represent the general public's opinion about specific companies.
2. **Is there a correlation between public sentiment and stock price fluctuations?**
   - The authors aim to find a correlation between Twitter sentiment data and changes in company stock prices through analysis.
3. **Can this correlation be used to predict stock price movements?**
   - The authors design a predictive model that uses historical Twitter sentiment data to forecast future stock price movement directions.

### Research Background

With the proliferation of smartphones and the internet, social media has become an important channel for obtaining real-time information. Particularly in the stock market, public sentiment and opinions may influence stock price fluctuations. Traditional stock price prediction methods mainly rely on historical data and macroeconomic trends, but recent studies have shown that public sentiment on social media can also provide valuable information. For example, Bollen et al. (2011) found that analyzing public sentiment on Twitter can improve the accuracy of standard stock market prediction models.

### Research Methods

1. **Data Collection and Preprocessing**
   - Use **Tweepy** and **getOldTweets** libraries to collect historical Twitter data.
   - Filter and label Twitter data through keyword searches to ensure data relevance and accuracy.
2. **Sentiment Analysis**
   - Use the **TextBlob** sentiment analysis engine to evaluate user sentiment in tweets, generating subjectivity and polarity scores for each tweet.
   - Convert tweet text into word vectors and perform sentiment mining using Naive Bayes and Random Forest classifiers.
3. **Data Aggregation and Modeling**
   - Calculate daily positive and negative sentiment scores and use a moving average method to reduce noise.
   - Perform correlation analysis between sentiment scores and actual stock price movement data to train the predictive model.
4. **Predictive Model**
   - Convert sentiment scores into stock price movement percentages, further translating them into trading recommendations (buy, hold, sell).
   - Train the model using historical data and validate its accuracy on test data.

### Experimental Results

- **Correlation Analysis**: Preliminary results show a certain correlation between sentiment scores and stock price movements, especially when using a 3-day moving average, where the correlation is strongest.
- **Predictive Performance**: On the test dataset, the model's recommendation accuracy is 80%, significantly higher than the 65% of a random model.

### Conclusion and Future Work

- **Main Challenges**:
  - Encountered rate limits and incomplete data issues with the Twitter API during data collection.
  - High noise in social media data requires further processing and classification.
  - The COVID-19 pandemic increased market volatility, adding to the prediction difficulty.
- **Future Directions**:
  - Expand the dataset to increase sample size, improving model reliability and generalization.
  - Explore more complex sentiment analysis methods to enhance the accuracy of sentiment scores.
  - Consider incorporating other data sources (e.g., news reports, financial statements) to integrate multiple information sources for prediction.

Through these efforts, the Taureau framework is expected to play a greater role in future stock market predictions.
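
### Illustrative Sketch: Aggregation and Lagged Correlation

The data aggregation and correlation steps listed under Research Methods can be illustrated with a minimal sketch. It assumes each tweet has already been scored with a polarity value in [-1, 1] (the range TextBlob reports) and that daily stock movements are available as percentages. The function names, the `(date, polarity)` input layout, and the one-day lag are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict
from math import sqrt


def daily_mean_polarity(scored_tweets):
    """Average polarity per day from (iso_date, polarity) pairs (assumed layout)."""
    buckets = defaultdict(list)
    for day, polarity in scored_tweets:
        buckets[day].append(polarity)
    # ISO date strings (YYYY-MM-DD) sort chronologically as plain strings.
    return [(day, sum(vals) / len(vals)) for day, vals in sorted(buckets.items())]


def moving_average(values, window=3):
    """Trailing moving average; the first window - 1 points are dropped."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0 or std_y == 0:
        return 0.0
    return cov / (std_x * std_y)


def lagged_correlation(sentiment, movement, lag=1):
    """Correlate sentiment on day t with price movement on day t + lag."""
    if lag > 0:
        sentiment, movement = sentiment[:-lag], movement[lag:]
    return pearson(sentiment, movement)
```

A trailing window is used here so that each smoothed value depends only on past days, which avoids leaking future information into a forecast. Before calling `lagged_correlation`, the smoothed sentiment series and the movement series need to be aligned to the same dates and have equal length.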
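
### Illustrative Sketch: From Sentiment Score to Recommendation

The predictive-model step (sentiment score to movement percentage to buy/hold/sell) could look roughly like the following. The summary does not state which model the authors fit or which cut-offs they use, so the ordinary least-squares line and the 1% threshold below are stand-in assumptions chosen only to make the idea concrete; all function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (assumed model form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("sentiment scores are constant; slope is undefined")
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def predict_movement(sentiment_score, slope, intercept):
    """Predicted price movement (percent) for a lagged sentiment score."""
    return slope * sentiment_score + intercept


def recommend(predicted_pct, threshold=1.0):
    """Map a predicted movement to a recommendation; threshold is hypothetical."""
    if predicted_pct > threshold:
        return "buy"
    if predicted_pct < -threshold:
        return "sell"
    return "hold"


def accuracy(predicted, actual):
    """Fraction of recommendations that match the reference labels."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(predicted)
```

In use, `fit_line` would be called on training-period pairs of lagged sentiment and observed movement, and `accuracy` would compare the resulting recommendations with labels derived from the held-out price data.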