Harnessing Twitter for Automatic Sentiment Identification Using Machine Learning Techniques

Amiya Kumar Dash, Jitendra Kumar Rout, Sanjay Kumar Jena
DOI: https://doi.org/10.1007/978-81-322-2529-4_53
2015-09-03
Abstract: User-generated content on Twitter is an ample source for gathering individuals’ opinions. Because of the huge number of tweets in the form of unstructured text, it is impossible to summarize the information manually. Accordingly, efficient computational methods are needed for mining and summarizing tweets from corpora, which requires knowledge of sentiment-bearing words. Many computational techniques, models, and algorithms exist for identifying sentiment in unstructured text. Most of them rely on machine-learning techniques, using a bag-of-words (BoW) representation as their basis. In this paper, we apply three different machine learning algorithms (Naive Bayes (NB), Maximum Entropy (ME), and Support Vector Machines (SVM)) to sentiment identification of tweets, in order to study the effectiveness of various feature combinations. Our experiments demonstrate that NB with Laplace smoothing using unigrams and Part-of-Speech (POS) tags as features, and SVM using unigrams as features, are effective in classifying the tweets.
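To make the NB-with-Laplace-smoothing idea concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier over unigram bag-of-words counts. It is not the authors' implementation: the abstract does not describe their dataset, tokenizer, or POS features, so the toy tweets, the whitespace tokenizer, the function names `train_nb` and `predict_nb`, and the choice of `alpha=1.0` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: a crude lowercase whitespace split stands in for real tweet preprocessing.
    return text.lower().split()


def train_nb(docs, labels, alpha=1.0):
    """Collect per-class unigram counts for multinomial Naive Bayes (hypothetical helper)."""
    class_docs = Counter(labels)          # number of training documents per class
    word_counts = defaultdict(Counter)    # unigram counts per class
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_docs": class_docs,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
        "n_docs": len(docs),
    }


def predict_nb(model, text):
    """Return the class with the highest smoothed log-posterior."""
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_label in model["class_docs"].items():
        counts = model["word_counts"][label]
        # Laplace (add-alpha) smoothing: every vocabulary word gets alpha pseudo-counts,
        # so a word unseen in this class never yields log(0).
        denom = sum(counts.values()) + alpha * vocab_size
        score = math.log(n_label / model["n_docs"])  # log prior
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore words never seen in training
            score += math.log((counts[tok] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for this example only.
docs = [
    "i love this phone",
    "what a great day",
    "i hate this traffic",
    "what a terrible day",
]
labels = ["pos", "pos", "neg", "neg"]
model = train_nb(docs, labels)

print(predict_nb(model, "love this great phone"))
print(predict_nb(model, "terrible traffic"))
```

Without the `alpha` pseudo-counts, a word such as "love" that never appears in the negative class would drive that class's probability to zero regardless of the other words in the tweet; smoothing keeps every class in contention. Adding POS information, as the abstract mentions, would require a tagger and a feature scheme that the abstract does not specify.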