A Text Classifier With Domain Adaptation For Sentiment Classification

Wei Chen, Jingyu Zhou
DOI: https://doi.org/10.1007/978-3-642-17187-1_6
2010-01-01
Abstract: In sentiment classification, traditional classification algorithms perform poorly when labeled data is limited. The EM-based Naive Bayes algorithm is often employed to augment the labeled data with unlabeled data. However, such an approach assumes that the distributions of these two sets of data are identical, which may not hold in practice and often results in inferior performance. We propose a semi-supervised algorithm, called Ratio-Adjusted EM-based Naive Bayes (RAEMNB), for sentiment classification, which combines knowledge from a source domain with limited training instances from a target domain. In RAEMNB, the initial Bayes model is trained on labeled instances from both domains. During each EM iteration, we add an extra R-step to adjust the ratio of predicted positive instances to negative ones, which is approximated from the labeled instances of the target domain. Experimental results show that our RAEMNB approach outperforms traditional supervised and semi-supervised classifiers.
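The loop described in the abstract (an initial model from both domains, then EM iterations with an extra R-step) can be illustrated with a minimal Python sketch. The abstract does not specify how the R-step adjusts the ratio, so the version here is an assumption: it ranks unlabeled documents by their predicted positive probability and hard-labels the top fraction as positive, where the fraction comes from the labeled target-domain instances. The function names `train_nb`, `posterior_pos`, `ratio_step`, and `raemnb`, the smoothing constant `alpha`, and the iteration count are all hypothetical and not taken from the paper.

```python
import math
from collections import Counter

def train_nb(docs, weights, vocab, alpha=1.0):
    """Weighted multinomial Naive Bayes; weights[i] is P(positive | doc i)."""
    prior = [alpha, alpha]             # smoothed class mass: [negative, positive]
    counts = [Counter(), Counter()]    # per-class (fractional) token counts
    for doc, w in zip(docs, weights):
        for c, wc in ((0, 1.0 - w), (1, w)):
            prior[c] += wc
            for tok in doc:
                counts[c][tok] += wc
    total = sum(prior)
    log_prior = [math.log(p / total) for p in prior]
    log_lik = []
    for c in (0, 1):
        denom = sum(counts[c].values()) + alpha * len(vocab)
        log_lik.append({t: math.log((counts[c][t] + alpha) / denom) for t in vocab})
    return log_prior, log_lik

def posterior_pos(model, doc):
    """P(positive | doc); tokens outside the vocabulary are ignored."""
    log_prior, log_lik = model
    scores = [log_prior[c] + sum(log_lik[c].get(t, 0.0) for t in doc) for c in (0, 1)]
    m = max(scores)                    # subtract the max for numerical stability
    e = [math.exp(s - m) for s in scores]
    return e[1] / (e[0] + e[1])

def ratio_step(probs, pos_fraction):
    """Assumed R-step: hard-label the top pos_fraction of documents as positive."""
    k = round(pos_fraction * len(probs))
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    adjusted = [0.0] * len(probs)
    for i in order[:k]:
        adjusted[i] = 1.0
    return adjusted

def raemnb(source, target, unlabeled, iterations=5):
    """source/target: lists of (tokens, label in {0, 1}); unlabeled: lists of tokens."""
    labeled = source + target
    docs = [d for d, _ in labeled]
    labels = [float(y) for _, y in labeled]
    vocab = {t for d in docs + unlabeled for t in d}
    # Positive fraction estimated from labeled target-domain instances only.
    pos_fraction = sum(y for _, y in target) / len(target)
    model = train_nb(docs, labels, vocab)                          # initial model
    for _ in range(iterations):
        probs = [posterior_pos(model, d) for d in unlabeled]       # E-step
        probs = ratio_step(probs, pos_fraction)                    # R-step (assumed form)
        model = train_nb(docs + unlabeled, labels + probs, vocab)  # M-step
    return model
```

Calling `raemnb(source, target, unlabeled)` with tokenized documents returns a model that `posterior_pos` can score new documents against. A soft variant that rescales the posterior odds instead of hard-labeling would fit the abstract's description equally well; the hard-label form is used here only because it is the shortest to state.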