Sentiment analysis of the song "mojito"

Shaomin Wang
DOI: https://doi.org/10.1145/3443467.3443846
2020-11-06
Abstract: With the development of the Internet, people share views and opinions anytime and anywhere, producing information as well as receiving it. Using Douban users' reviews of Jay Chou's new song "mojito", this paper uses Python's JSON tool to calculate the positive and negative probability values of each comment, labeling a comment as positive when its positive-tendency probability exceeds 0.5 and as negative otherwise. To show the reasons behind user ratings directly, a word cloud is drawn from the comment data. Once the positive and negative sentiment labels are determined, the data are preprocessed: data cleaning, Chinese word segmentation, stop-word removal, text vectorization, and so on. Three models (naive Bayes, logistic regression, and support vector machine) are then built for comparison, and the naive Bayes model is selected for prediction based on its cross-validation score. Evaluation with a confusion matrix shows that the model classifies negative reviews more accurately than positive ones, which may be related to irony and double negation in the review text.
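The labeling rule and the confusion-matrix evaluation described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's code: the function names, the probability values, and the predicted labels are all hypothetical, and only the 0.5 threshold comes from the abstract.

```python
from collections import Counter

# From the abstract: a positive-tendency probability greater than 0.5
# counts as a positive evaluation, anything else as negative.
POSITIVE_THRESHOLD = 0.5


def label_from_probability(p_positive):
    """Map a positive-tendency probability to a sentiment tag."""
    return "positive" if p_positive > POSITIVE_THRESHOLD else "negative"


def confusion_matrix(y_true, y_pred):
    """Count (true tag, predicted tag) pairs."""
    return Counter(zip(y_true, y_pred))


def recall(matrix, label):
    """Share of comments whose true tag is `label` that were predicted as `label`."""
    total = sum(n for (true, _), n in matrix.items() if true == label)
    return matrix[(label, label)] / total if total else 0.0


# Hypothetical probabilities, invented for illustration (not data from the paper).
probabilities = [0.91, 0.50, 0.12, 0.73, 0.38, 0.66]
y_true = [label_from_probability(p) for p in probabilities]

# Hypothetical classifier output, also invented for illustration.
y_pred = ["positive", "negative", "negative", "negative", "negative", "negative"]

matrix = confusion_matrix(y_true, y_pred)
print("negative recall:", recall(matrix, "negative"))
print("positive recall:", recall(matrix, "positive"))
```

Comparing the per-class recall values is one simple way to see the kind of asymmetry the abstract reports, where one class is recognized more reliably than the other.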