Cross-lingual sentiment lexicon learning

Dehong Gao
2014-01-01
Abstract: A sentiment lexicon contains a set of words with known sentiment (e.g., "good", "nice" and "bad"). It is widely recognized that sentiment lexicons play a fundamental role in sentiment analysis. Compared with the existing sentiment lexicons in English, those available in other languages, such as Chinese, are far from sufficient. This dissertation focuses on Cross-lingual Sentiment Lexicon Learning (CSLL), whose goal is to make full use of the existing sentiment resources in one or more languages to automatically learn sentiment lexicons for other languages. The dissertation presents a systematic study of CSLL. In bilingual graph based sentiment lexicon learning, a bilingual graph is built from words in English and in the target language for which we want to generate the sentiment lexicon. A label propagation based approach is proposed to transfer sentiment information from English to the target language. To the best of our knowledge, this is the first time that word alignment information derived from a parallel corpus has been leveraged to build the inter-language relations in CSLL, and it is shown to significantly increase the coverage of the learned sentiment lexicon. In this work, the sentiment polarity of a word is determined by the sentiment information of the words connected to it in the bilingual graph. In co-training based bilingual sentiment lexicon learning, we consider not only the sentiment information of the connected words but also information about the words themselves (e.g., word definitions). From these two types of information, novel and effective features are explored to deduce the sentiment polarity of …
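
The bilingual graph idea described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's actual algorithm: the toy graph, the edge weights, the seed scores, the romanized Chinese placeholders, and the names EDGES, SEEDS, propagate and polarity are all assumptions made for illustration. It shows a generic label propagation in which English seed words keep their known polarity and every other word repeatedly takes the weighted average of its neighbours' scores.

```python
# Hypothetical toy bilingual graph; edges and weights are illustrative only,
# not data from the dissertation. Chinese words are romanized placeholders.
EDGES = [
    ("en:good", "en:nice", 1.0),     # intra-language relation (e.g. synonymy)
    ("en:good", "zh:hao", 1.0),      # inter-language relation (e.g. word alignment)
    ("en:nice", "zh:bucuo", 1.0),
    ("en:bad", "zh:huai", 1.0),
    ("zh:huai", "zh:zaogao", 1.0),   # target-language word with no direct English link
]

# Assumed seed polarities from an existing English lexicon: +1 positive, -1 negative.
SEEDS = {"en:good": 1.0, "en:nice": 1.0, "en:bad": -1.0}


def propagate(edges, seeds, iterations=50):
    """Generic label propagation: seeds stay clamped, other nodes average neighbours."""
    neighbours = {}
    for u, v, w in edges:
        neighbours.setdefault(u, []).append((v, w))
        neighbours.setdefault(v, []).append((u, w))

    # Unlabelled words start with a neutral score of 0.
    scores = {node: seeds.get(node, 0.0) for node in neighbours}
    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbours.items():
            if node in seeds:
                updated[node] = seeds[node]  # keep known sentiment fixed
            else:
                total = sum(w for _, w in adjacent)
                updated[node] = sum(w * scores[n] for n, w in adjacent) / total
        scores = updated
    return scores


def polarity(score):
    """Map a propagated score to a coarse polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "unknown"


scores = propagate(EDGES, SEEDS)
# Keep only target-language words to form the learned lexicon.
lexicon = {word: polarity(s) for word, s in scores.items() if word.startswith("zh:")}
print(lexicon)
```

In this sketch the word "zh:zaogao" has no direct link to English, yet it still receives a negative label through its neighbour "zh:huai", which is the sense in which sentiment information is transferred across the graph.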