Improving Chinese Sentiment Analysis via Segmentation-Based Representation Using Parallel CNN.
Yazhou Hao,Qinghua Zheng,YangYang Lan,Yufei Li,Meng Wang,Sen Wang,Chen Li
DOI: https://doi.org/10.1007/978-3-319-69179-4_47
2017-01-01
Abstract: Automatically analyzing the sentiment expressed in text relies on well-designed models that exploit linguistic features. As a result, most models are language-dependent, and most are designed for English text. Chinese has the largest number of users in the world, and a tremendous amount of Chinese text is generated daily on social media and elsewhere, yet it has seldom been studied. On the other hand, a general observation that holds in many languages is that different segments of a piece of text, e.g. clauses, can have different sentiment polarities. Existing deep learning models neglect this imbalanced sentiment distribution and take only the entire text as input. This paper proposes a novel sentiment analysis model for Chinese. First, the model segments a text into smaller units at punctuation marks to obtain a preliminary text representation; this step is called segmentation-based representation. Its new framework, parallel CNN (convolutional neural network), then uses all segments simultaneously. The model, which we call SBR-PCNN, concatenates the representations of the segments to obtain the final representation of the text, which not only contains semantic and syntactic features but also retains the essential sequential information. The proposed method is evaluated on two Chinese sentiment classification datasets and compared with a broad range of baselines. Experimental results show that the proposed approach achieves state-of-the-art results on both benchmark datasets and indicate that the model can improve the performance of Chinese sentiment analysis.
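
The segmentation-based representation step and the order-preserving concatenation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the punctuation set, the fixed segment count, and the names `segment_text`, `pad_segments`, `concat_representations`, and `toy_encode` are all assumptions made for illustration, and `toy_encode` is only a stand-in for the per-segment CNN, which the abstract does not specify in detail.

```python
import re

# Assumed delimiter set: common Chinese and ASCII clause-level punctuation.
# The abstract does not list the exact punctuation marks the paper uses.
PUNCTUATION = "，。！？；：、,.!?;:"
_SPLIT_RE = re.compile("[" + re.escape(PUNCTUATION) + "]+")


def segment_text(text):
    """Split text into non-empty segments at runs of punctuation."""
    return [s.strip() for s in _SPLIT_RE.split(text) if s.strip()]


def pad_segments(segments, num_segments, pad=""):
    """Truncate or pad so every text yields the same number of segments."""
    fixed = segments[:num_segments]
    return fixed + [pad] * (num_segments - len(fixed))


def concat_representations(segments, encode):
    """Encode each segment independently, then concatenate in original order."""
    vector = []
    for seg in segments:
        vector.extend(encode(seg))
    return vector


def toy_encode(segment):
    """Hypothetical stand-in for a per-segment CNN: [length, count of '好']."""
    return [len(segment), segment.count("好")]


segments = pad_segments(segment_text("手机很好，但是电池不行。"), num_segments=3)
representation = concat_representations(segments, toy_encode)
print(segments)
print(representation)
```

Because each segment is encoded on its own and the results are joined in their original order, the position of a sub-vector in the final representation records which clause it came from. In a real model, `toy_encode` would be replaced by a convolutional encoder applied to every segment's word embeddings.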