Extending emotional lexicon for improving the classification accuracy of Chinese film reviews
Qiaoyun Wang, Guangli Zhu, Shunxiang Zhang, Kuan-Ching Li, Xiang Chen, Hanqing Xu
DOI: https://doi.org/10.1080/09540091.2020.1782839
2020-07-02
Connection Science
Abstract: Building a domain-specific emotional lexicon for film reviews is challenging because of the domain's particular characteristics, such as massive data volumes and a constant stream of new, out-of-vocabulary words. To improve the accuracy of film review classification, this article proposes a method for extending an emotional lexicon based on word distance and pointwise mutual information. First, an improved K-means++ algorithm is used to cluster words and select seed words with obvious emotional tendencies. Next, the Distance of Word and Point Mutual Information (DW-PMI) algorithm is presented to determine the emotional polarity of emotional words in the film review domain. Four types of vocabulary (degree adverbs, negation words, emoticons, and the film-review-domain emotion dictionary) are then added to the basic emotion dictionary to extend the film review emotional lexicon. Experimental results show that the expanded emotional lexicon for Chinese film reviews improves the accuracy and precision of film review emotion analysis.
computer science, artificial intelligence, theory & methods
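
The abstract describes assigning polarity to candidate words by their mutual information with emotional seed words. The abstract does not give the DW-PMI formula, so the following is only a minimal sketch of the classic semantic-orientation PMI idea that such methods build on, not the authors' algorithm: it omits the word-distance term and the K-means++ seed selection. The function name `so_pmi`, the `smoothing` parameter, the document-level co-occurrence window, and the toy English tokens (standing in for segmented Chinese reviews) are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=1.0):
    """Score each non-seed word by PMI with positive seeds minus PMI with negative seeds.

    docs: list of token lists (one list per review).
    Co-occurrence is counted once per review (an assumption of this sketch).
    """
    n_docs = len(docs)
    word_df = Counter()  # number of reviews containing each word
    pair_df = Counter()  # number of reviews containing both words of a pair
    for tokens in docs:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    def pmi(word, seed):
        # Additive smoothing keeps the logarithm finite when a pair never co-occurs.
        key = tuple(sorted((word, seed)))
        joint = pair_df[key] + smoothing
        return math.log2(joint * n_docs / (word_df[word] * word_df[seed]))

    seeds = set(pos_seeds) | set(neg_seeds)
    scores = {}
    for word in word_df:
        if word in seeds:
            continue
        pos = sum(pmi(word, s) for s in pos_seeds if word_df[s])
        neg = sum(pmi(word, s) for s in neg_seeds if word_df[s])
        scores[word] = pos - neg  # > 0 suggests positive polarity, < 0 negative
    return scores


# Toy corpus; real input would be word-segmented Chinese film reviews.
docs = [
    ["great", "brilliant"],
    ["great", "moving"],
    ["awful", "boring"],
    ["awful", "slow"],
    ["brilliant", "moving"],
]
scores = so_pmi(docs, pos_seeds=["great"], neg_seeds=["awful"])
```

In this sketch, words whose score is positive would be added to the positive side of the extended lexicon and words with a negative score to the negative side; a threshold around zero could be used to discard words with no clear tendency.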