Research on Building Lexicon for Sentiment Analysis Based on the Chinese Microblogging Smiley

Bin GUI, Xiao-ping YANG, Zhong-xia ZHANG, Wen-tao XIAO
DOI: https://doi.org/10.15918/j.tbit1001-0645.2014.05.020
2014-01-01
Abstract: A method for automatically building a sentiment lexicon from microblog smileys is proposed. First, a large number of microblog posts containing emoticons are crawled from a microblogging platform, and the sentiment tendency of each post is annotated according to its smiley to generate an emotion corpus. After the corpus has been preprocessed (word segmentation and duplicate removal), candidate sentiment words are extracted according to part-of-speech rules. For each word, its occurrences in the positive and negative portions of the corpus are counted, and the chi-square statistic computed from these counts gives the word's emotional intensity. The polarity of each word is then determined from the probability with which it appears in positive and negative microblog texts, and the sentiment dictionary is generated from these results. This is a new approach to lexicon construction. With a manually annotated sentiment dictionary as the baseline, the experimental results show that the accuracy of the proposed method in labeling sentiment words is about 80%, and that the generated dictionary achieves its best F-value, above 82%, when the intensity threshold for sentiment words is 20 or 30.
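The scoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the following are assumptions rather than the authors' implementation: the standard 2x2 chi-square statistic over post-level counts is used for intensity, polarity is taken from whichever class the word appears in at the higher rate, and the names `build_lexicon` and `min_intensity` are hypothetical. The input is assumed to be posts already segmented, deduplicated, filtered by part of speech, and labeled from their smileys.

```python
from collections import Counter


def build_lexicon(posts, min_intensity=20.0):
    """Score candidate words from smiley-labeled posts.

    posts: iterable of (tokens, label) pairs, label being "pos" or "neg".
    Returns {word: (polarity, intensity)} for words whose chi-square
    intensity reaches min_intensity (assumed threshold semantics).
    """
    n_pos = n_neg = 0
    pos_df, neg_df = Counter(), Counter()
    for tokens, label in posts:
        unique = set(tokens)  # count each word once per post
        if label == "pos":
            n_pos += 1
            pos_df.update(unique)
        else:
            n_neg += 1
            neg_df.update(unique)

    n = n_pos + n_neg
    lexicon = {}
    for word in set(pos_df) | set(neg_df):
        a = pos_df[word]   # positive posts containing the word
        b = neg_df[word]   # negative posts containing the word
        c = n_pos - a      # positive posts without the word
        d = n_neg - b      # negative posts without the word
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        if denom == 0:
            continue       # word in every post, or one class is empty
        # Standard 2x2 chi-square statistic (assumed form of "intensity").
        chi2 = n * (a * d - b * c) ** 2 / denom
        if chi2 < min_intensity:
            continue
        # Assumed polarity rule: the class with the higher occurrence rate;
        # ties go to "neg" arbitrarily.
        polarity = "pos" if a / n_pos > b / n_neg else "neg"
        lexicon[word] = (polarity, chi2)
    return lexicon
```

Comparing occurrence rates rather than raw counts keeps the polarity decision from being skewed when the positive and negative portions of the corpus differ in size. Raising `min_intensity` trades recall for precision, which is consistent with the abstract reporting the best F-value at a particular threshold.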