Leveraging Large Data with Weak Supervision for Joint Feature and Opinion Word Extraction

Lei Fang, Biao Liu, Min-Lie Huang
DOI: https://doi.org/10.1007/s11390-015-1569-3
2015-01-01
Abstract: Product feature and opinion word extraction is important for fine-grained sentiment analysis. In this paper, we leverage large-scale unlabeled data for joint extraction of feature and opinion words in a knowledge-poor setting, in which only a few feature-opinion pairs are used as weak supervision. Our major contributions are two-fold. First, we propose a data-driven approach that represents product features and opinion words as lists of corpus-level syntactic relations, which capture rich language structures. Second, we build a simple yet robust unsupervised model that incorporates prior knowledge to extract new feature and opinion words with high performance. The extraction process is based on a bootstrapping framework which, to some extent, reduces error propagation on large data. Experimental results under various settings, compared with state-of-the-art baselines, demonstrate that our method is effective and promising.