A Semi-Supervised Feature Clustering Algorithm with Application to Word Sense Disambiguation

Zheng-Yu Niu,Dong-Hong Ji,Chew Lim Tan
DOI: https://doi.org/10.3115/1220575.1220689
2005-01-01
Abstract:In this paper we investigate an application of feature clustering for word sense disambiguation, and propose a semisupervised feature clustering algorithm. Compared with other feature clustering methods (ex. supervised feature clustering), it can infer the distribution of class labels over (unseen) features unavailable in training data (labeled data) by the use of the distribution of class labels over (seen) features available in training data. Thus, it can deal with both seen and unseen features in feature clustering process. Our experimental results show that feature clustering can aggressively reduce the dimensionality of feature space, while still maintaining state of the art sense disambiguation accuracy. Furthermore, when combined with a semi-supervised WSD algorithm, semi-supervised feature clustering outperforms other dimensionality reduction techniques, which indicates that using unlabeled data in learning process helps to improve the performance of feature clustering and sense disambiguation.
What problem does this paper attempt to address?