Partially Supervised Sense Disambiguation by Learning Sense Number from Tagged and Untagged Corpora.

Zheng-Yu Niu, Dong-Hong Ji, Chew Lim Tan
DOI: https://doi.org/10.3115/1610075.1610134
2006-01-01
Abstract: Supervised and semi-supervised sense disambiguation methods will mis-tag the instances of a target word if the senses of these instances are not defined in sense inventories or if there are no tagged instances for these senses in the training data. Here we use a model order identification method to avoid misclassifying instances with undefined senses by discovering new senses from mixed data (tagged and untagged corpora). The algorithm tries to obtain a natural partition of the mixed data by maximizing a stability criterion, defined on the classification result of an extended label propagation algorithm, over all possible values of the number of senses (also called the sense number or model order). Experimental results on SENSEVAL-3 data indicate that it outperforms SVM, a one-class partially supervised classification algorithm, and a clustering-based model order identification algorithm when the tagged data is incomplete.
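The selection loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: `propagate` is a plain Gaussian-affinity label propagation, `extend_seeds` is a hypothetical stand-in for the paper's extension step (it seeds each extra sense with the point farthest from the existing seeds), and `pair_agreement` uses an unnormalized Rand index where the paper defines its own stability criterion. All function and parameter names are invented here.

```python
import numpy as np

def propagate(X, seeds, k, sigma=1.0, iters=100):
    """Label propagation: clamp seed rows, diffuse labels over a Gaussian affinity graph."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    T = W / W.sum(axis=1, keepdims=True)       # row-normalized transition matrix
    idx = np.array(sorted(seeds))
    clamp = np.zeros((len(idx), k))
    clamp[np.arange(len(idx)), [seeds[int(i)] for i in idx]] = 1.0
    Y = np.full((len(X), k), 1.0 / k)
    for _ in range(iters):
        Y[idx] = clamp                          # keep seed labels fixed
        Y = T @ Y
    Y[idx] = clamp
    return Y.argmax(axis=1)

def extend_seeds(X, tagged, k):
    """Assumed heuristic: seed each extra sense with the point farthest from current seeds.
    Assumes tagged sense labels are 0..m-1."""
    seeds = dict(tagged)
    next_label = max(tagged.values()) + 1
    while next_label < k:
        pts = X[sorted(seeds)]
        dist = np.linalg.norm(X[:, None, :] - pts[None, :, :], axis=-1).min(axis=1)
        seeds[int(dist.argmax())] = next_label
        next_label += 1
    return seeds

def pair_agreement(a, b):
    """Rand index: fraction of point pairs on which two labelings agree."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)
    return float((same_a[iu] == same_b[iu]).mean())

def stability(X, tagged, k, rng, rounds=5, frac=0.8):
    """Mean agreement between the full-data partition and partitions of subsamples."""
    full = propagate(X, extend_seeds(X, tagged, k), k)
    untagged = np.array([i for i in range(len(X)) if i not in tagged])
    scores = []
    for _ in range(rounds):
        keep = rng.choice(untagged, size=int(frac * len(untagged)), replace=False)
        sub = np.sort(np.concatenate([np.array(sorted(tagged)), keep]))
        pos = {int(g): p for p, g in enumerate(sub)}   # global index -> subsample index
        sub_tagged = {pos[g]: c for g, c in tagged.items()}
        part = propagate(X[sub], extend_seeds(X[sub], sub_tagged, k), k)
        scores.append(pair_agreement(full[sub], part))
    return float(np.mean(scores))

def select_sense_number(X, tagged, k_max, seed=0):
    """Score every candidate sense number from the tagged count up to k_max."""
    rng = np.random.default_rng(seed)
    k_min = len(set(tagged.values()))
    scores = {k: stability(X, tagged, k, rng) for k in range(k_min, k_max + 1)}
    return max(scores, key=scores.get), scores
```

Here `X` is an array of instance feature vectors and `tagged` maps row indices to known sense labels, so `select_sense_number(X, tagged, k_max=4)` returns the highest-scoring candidate together with the per-candidate scores. Because the Rand index is not corrected for chance, coarse partitions can score deceptively well; a faithful reimplementation would need the normalization and extension step given in the paper itself.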