Comparative Study of Machine Learning Algorithms for Twitter Sentiment Analysis

Yash Indulkar, Abhijit Patil
DOI: https://doi.org/10.1109/esci50559.2021.9396925
2021-03-05
Abstract: Sentiment analysis is important for understanding various aspects of human emotion expressed through different modes, for example by interpreting text or analyzing it to obtain the desired outputs. This paper compares three algorithms for sentiment analysis, Logistic Regression, Multinomial Naïve Bayes, and Random Forest, on the Uber and Ola datasets. A total of 3000 tweets were extracted from Twitter, then cleaned and tokenized using Python. The key component of this work is Google Word2Vec: the tokenized tweets are transformed using the vocabulary from Google Word2Vec. Drawing on this immense dataset of words helped the tokenized words form a better vocabulary and understanding. Finally, the accuracy and the Mean Cross-Validation Accuracy (MCVA) were generated for all three algorithms to check whether each gave proper results on the trained data. Visualizations of the accuracy of the three algorithms helped to select the most accurate one. Python was used for both pre-processing and analysis.
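The pipeline described in the abstract (clean and tokenize tweets, map tokens to word vectors, then score a classifier by mean cross-validation accuracy) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the cleaning rules, the averaging of word vectors per tweet, the fold assignment, and the names `clean_and_tokenize`, `average_vector`, `mean_cv_accuracy`, and `fit_predict` are all hypothetical. The abstract does not say how tweets were cleaned or how the Word2Vec vectors were combined, and the small `embeddings` dictionary expected here merely stands in for the Google Word2Vec vocabulary.

```python
import re
from statistics import mean

# Assumed cleaning rules; the abstract does not list the actual ones.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_and_tokenize(tweet):
    """Lowercase a tweet, strip URLs, mentions and punctuation, split on whitespace."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    text = NON_ALPHA_RE.sub(" ", text)
    return text.split()


def average_vector(tokens, embeddings, dim):
    """Average the vectors of in-vocabulary tokens (one assumed way to get a tweet vector)."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim  # no token found in the vocabulary
    return [mean(v[i] for v in known) for i in range(dim)]


def mean_cv_accuracy(features, labels, fit_predict, k=5):
    """Mean accuracy over k folds.

    fit_predict(train_X, train_y, test_X) is a hypothetical callable that
    trains a classifier and returns predicted labels for test_X.
    """
    n = len(labels)
    fold_accuracies = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        predictions = fit_predict(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [features[i] for i in test_idx],
        )
        correct = sum(p == labels[i] for p, i in zip(predictions, test_idx))
        fold_accuracies.append(correct / len(test_idx))
    return mean(fold_accuracies)
```

Any of the three compared algorithms could be wrapped as `fit_predict`. One practical caveat, if averaged word vectors are used as features: Multinomial Naïve Bayes expects non-negative inputs, so the vectors would likely need rescaling first.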