Parallel Sentiment Polarity Classification Method with Substring Feature Reduction

Yaowen Zhang, Xiaojun Xiang, Cunyan Yin, Lin Shang
DOI: https://doi.org/10.1007/978-3-642-40319-4_11
2013-01-01
Abstract: Sentiment analysis is an important problem in machine learning that aims to identify the emotion expressed in a corpus. It is a difficult task, especially on large-scale data, where feature reduction is needed. In this paper, we propose a parallel feature reduction algorithm for sentiment polarity classification based on a substring method. The algorithm runs in parallel on the Hadoop platform. It is evaluated on a large data set, with a K-nearest neighbor algorithm and a Rocchio algorithm used for classification. Experimental results show that the proposed algorithm outperforms other commonly used methods in both classification performance and computational cost.
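The abstract names a Rocchio classifier applied to substring features but gives no implementation details. The following is a minimal single-machine sketch of a standard Rocchio (nearest-centroid) classifier, not the authors' parallel Hadoop method. It assumes character bigrams as a stand-in for the paper's substring features and cosine similarity as the distance measure; the function names and the toy `TRAIN` data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def substring_features(text, n=2):
    """Count character n-grams (an assumed stand-in for substring features)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def centroid(vectors):
    """Average a list of sparse count vectors into one class centroid."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {key: count / len(vectors) for key, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_rocchio(labeled_texts):
    """Build one centroid per label from (text, label) pairs."""
    by_label = {}
    for text, label in labeled_texts:
        by_label.setdefault(label, []).append(substring_features(text))
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def classify(centroids, text):
    """Return the label whose centroid is most similar to the text."""
    feats = substring_features(text)
    return max(centroids, key=lambda label: cosine(feats, centroids[label]))


# Hypothetical toy data, not taken from the paper.
TRAIN = [
    ("good great good", "pos"),
    ("great fine good", "pos"),
    ("bad awful bad", "neg"),
    ("awful poor bad", "neg"),
]

if __name__ == "__main__":
    model = train_rocchio(TRAIN)
    print(classify(model, "good fine"))
    print(classify(model, "bad poor"))
```

In a parallel setting, the per-document feature counting and the per-class summation are the natural candidates for map and reduce steps respectively, although the abstract does not say how the authors divide the work.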