Integrating Multiple Knowledge Sources to Disambiguate Word Sense: An Exemplar-Based Approach

Hwee Tou Ng, Hian Beng Lee
DOI: https://doi.org/10.48550/arXiv.cmp-lg/9606032
1996-06-29
Computation and Language
Abstract: In this paper, we present a new approach for word sense disambiguation (WSD) using an exemplar-based learning algorithm. This approach integrates a diverse set of knowledge sources to disambiguate word sense, including part of speech of neighboring words, morphological form, the unordered set of surrounding words, local collocations, and verb-object syntactic relation. We tested our WSD program, named LEXAS, on both a common data set used in previous work and a large sense-tagged corpus that we separately constructed. LEXAS achieves a higher accuracy than previous work on the common data set, and performs better than the most frequent sense heuristic on the highly ambiguous words in the large corpus tagged with the refined senses of WordNet.
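To make the exemplar-based idea concrete, here is a minimal sketch of how several knowledge sources can be combined into one feature set and matched against stored, sense-tagged examples. It is an illustration under stated assumptions, not the LEXAS implementation: the feature names, the sense labels, the toy exemplars for the noun "interest", and the simple overlap distance (a count of mismatched feature values) are all hypothetical choices made for this example.

```python
from collections import Counter

def overlap_distance(a, b):
    """Count the features on which two examples disagree (assumed metric)."""
    keys = set(a) | set(b)
    return sum(1 for key in keys if a.get(key) != b.get(key))

def disambiguate(query, exemplars, k=1):
    """Return the majority sense among the k stored exemplars nearest to query."""
    ranked = sorted(exemplars, key=lambda ex: overlap_distance(query, ex[0]))
    votes = Counter(sense for _, sense in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical sense-tagged exemplars for "interest". Each feature stands in
# for one knowledge source: part of speech of neighboring words, morphological
# form, a local collocation, and the verb in a verb-object relation.
EXEMPLARS = [
    ({"left_pos": "JJ", "right_pos": "NN", "form": "singular",
      "left_word": "annual", "governing_verb": "pay"}, "money_paid"),
    ({"left_pos": "JJ", "right_pos": "IN", "form": "singular",
      "left_word": "keen", "governing_verb": "show"}, "curiosity"),
    ({"left_pos": "DT", "right_pos": "NN", "form": "singular",
      "left_word": "the", "governing_verb": "raise"}, "money_paid"),
]

# Hypothetical test occurrence, e.g. "... showed great interest in ..."
QUERY = {"left_pos": "JJ", "right_pos": "IN", "form": "singular",
         "left_word": "great", "governing_verb": "show"}

predicted_sense = disambiguate(QUERY, EXEMPLARS)
```

With k=1 the query is assigned the sense of its single closest exemplar; a larger k takes a majority vote over more neighbors. A real system would extract these features automatically from a tagged corpus and would likely use a more informed distance between feature values than plain mismatch counting.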