Bayesian Estimation‐based Sentiment Word Embedding Model for Sentiment Analysis

Jingyao Tang, Yun Xue, Ziwen Wang, Shaoyang Hu, Tao Gong, Yinong Chen, Haoliang Zhao, Luwei Xiao
DOI: https://doi.org/10.1049/cit2.12037
IF: 7.985
2021-01-01
CAAI Transactions on Intelligence Technology
Abstract: Sentiment word embedding has been extensively studied and used in sentiment analysis tasks. However, most existing models fail to differentiate between high-frequency and low-frequency words. As a result, the sentiment information of low-frequency words is insufficiently captured, which leads to inaccurate sentiment word embeddings and degrades overall sentiment analysis performance. A Bayesian estimation-based sentiment word embedding (BESWE) model is proposed, which aims to precisely extract the sentiment information of low-frequency words. In the model, a Bayesian estimator is constructed from the co-occurrence probabilities and sentiment probabilities of words, and a novel loss function is defined for sentiment word embedding learning. Experimental results on sentiment lexicons and the Movie Review dataset show that BESWE outperforms many state-of-the-art methods, for example C&W, CBOW, GloVe, SE-HyRank and DLJT1, in sentiment analysis tasks. This demonstrates that Bayesian estimation can effectively capture the sentiment information of low-frequency words and integrate it into the word embedding through the loss function. In addition, replacing the embeddings of low-frequency words in the state-of-the-art methods with those from BESWE can significantly improve the performance of those methods in sentiment analysis tasks.
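The abstract does not give the form of the BESWE estimator or its loss function, so the following is only a generic illustration of why Bayesian estimation helps with low-frequency words, not the authors' method. It is a minimal sketch that assumes a Beta prior over a word's probability of appearing in positive contexts; the function name and the `prior_mean` and `prior_strength` parameters are hypothetical.

```python
def posterior_positive_prob(pos_count, total_count,
                            prior_mean=0.5, prior_strength=10.0):
    """Posterior mean of P(positive | word) under a Beta prior.

    Assumption for illustration only: the prior is
    Beta(prior_mean * prior_strength, (1 - prior_mean) * prior_strength).
    This is NOT the estimator defined in the BESWE paper.
    """
    if pos_count < 0 or total_count < 0 or pos_count > total_count:
        raise ValueError("counts must satisfy 0 <= pos_count <= total_count")
    alpha = prior_mean * prior_strength          # pseudo-count of positive contexts
    beta = (1.0 - prior_mean) * prior_strength   # pseudo-count of negative contexts
    return (pos_count + alpha) / (total_count + alpha + beta)


# A rare word seen twice, both times in positive contexts: the raw
# relative frequency would be 1.0, but the posterior stays near the prior.
rare = posterior_positive_prob(2, 2)

# A frequent word: the observed counts dominate the prior.
frequent = posterior_positive_prob(900, 1000)

print(f"rare word:     {rare:.3f}")
print(f"frequent word: {frequent:.3f}")
```

With few observations the estimate is pulled toward the prior rather than trusting an extreme raw frequency, while for frequent words the data dominates. This is the general behaviour that makes a Bayesian estimate attractive for low-frequency words.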