Word Clustering for Collocation-Based Word Sense Disambiguation

Peng Jin, Xu Sun, Yunfang Wu, Shiwen Yu
DOI: https://doi.org/10.1007/978-3-540-70939-8_24
2007-01-01
Abstract: The main disadvantage of collocation-based word sense disambiguation is its low recall, despite relatively high precision. How can recall be improved without decreasing precision? In this paper, we investigate a word-class approach to extending the collocation list, which is constructed from a manually sense-tagged corpus. The word classes, however, are obtained from a larger corpus that is not sense-tagged. Experimental results show that the F-measure improves to 71%, compared with 54% for the baseline system that does not use word classes, although precision decreases slightly. Further study reveals the relationship between the F-measure and the number of word classes trained on corpora of various sizes.
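The idea of extending a collocation list through word classes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the clustering method or the matching rule, so the sketch assumes that word classes are already given as a word-to-class mapping, that a collocate's sense is simply copied to every other word in its class, and that the first matching context word decides the sense. The function names, the sense labels, and the toy data for the target word "bank" are all hypothetical.

```python
from collections import defaultdict


def extend_collocations(collocations, word_class):
    """Copy each collocate's sense to the other words in its class.

    collocations: dict mapping collocate -> sense (from sense-tagged data).
    word_class:   dict mapping word -> class id (from untagged data).
    Assumption: entries from the tagged corpus are never overwritten.
    """
    members = defaultdict(set)
    for word, cls in word_class.items():
        members[cls].add(word)

    extended = dict(collocations)
    for collocate, sense in collocations.items():
        cls = word_class.get(collocate)
        if cls is None:
            continue  # collocate was not clustered; nothing to extend
        for other in members[cls]:
            # setdefault keeps any sense that is already assigned
            extended.setdefault(other, sense)
    return extended


def disambiguate(context, collocations):
    """Return the sense of the first known collocate, or None (abstain)."""
    for word in context:
        if word in collocations:
            return collocations[word]
    return None


# Hypothetical toy data for the target word "bank".
collocations = {"river": "bank/shore", "loan": "bank/finance"}
word_class = {"river": 1, "stream": 1, "loan": 2, "mortgage": 2, "tree": 3}

extended = extend_collocations(collocations, word_class)
context = ["the", "mortgage", "was", "approved"]
baseline_sense = disambiguate(context, collocations)
extended_sense = disambiguate(context, extended)
```

In this toy case the original list abstains on the context because "mortgage" never appeared as a collocate in the tagged data, while the extended list returns an answer. That is the recall gain the abstract describes. The slight precision loss corresponds to cases where a class groups words that actually signal different senses.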