Chinese Sentiment Analysis Exploiting Heterogeneous Segmentations.
Da Pan, Meishan Zhang, Guohong Fu
DOI: https://doi.org/10.1007/978-3-319-47674-2_32
2016-01-01
Abstract: Chinese is a character-based language that, unlike English, has no explicit separators between words. Traditionally, word segmentation is applied first to convert Chinese sentences into word sequences, so that the same framework used for English sentiment analysis can be applied to Chinese. Such work uses one specific word segmentor as a prerequisite step, yet ignores the fact that different segmentation styles exist in Chinese word segmentation, such as CTB, PKU, and MSR. In this paper, we study the influence of these heterogeneous segmentations on Chinese sentiment analysis, and then integrate them using both discrete and neural models. Experimental results show that the choice of segmentation does affect final performance, and that the integrated models achieve better performance.
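
As a rough illustration of what integrating segmentations in a discrete model might look like, the sketch below extracts unigram and bigram features from two segmentations of the same sentence and merges them into one feature set, tagging each feature with its segmentation style. This is not the paper's implementation: the style names `style_A` and `style_B`, the hand-written segmentations, and the functions `style_features` and `integrate` are all hypothetical, and the abstract does not specify the actual feature templates.

```python
from collections import Counter


def style_features(style, words):
    """Count unigram and bigram features, each tagged with the style name."""
    feats = Counter()
    for w in words:
        feats[f"{style}:uni={w}"] += 1
    for a, b in zip(words, words[1:]):
        feats[f"{style}:bi={a}_{b}"] += 1
    return feats


def integrate(segmentations):
    """Merge the features of every segmentation into one sparse feature set."""
    merged = Counter()
    for style, words in segmentations.items():
        merged.update(style_features(style, words))
    return merged


# Hand-written, purely illustrative segmentations of one sentence
# ("this movie is not good"); they are NOT real CTB/PKU/MSR output.
segmentations = {
    "style_A": ["这", "部", "电影", "不", "好看"],
    "style_B": ["这部", "电影", "不好看"],
}

features = integrate(segmentations)
for name, count in sorted(features.items()):
    print(name, count)
```

Because every feature carries its style prefix, a linear classifier trained on the merged set could weight evidence from each segmentation separately; for instance, the single token `不好看` under `style_B` and the bigram `不_好看` under `style_A` become distinct features for the same negated expression.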