A Scheme of Word Sense Disambiguation based on Integrated Language Knowledge Base

Jiangsheng Yu, Shiwen Yu
2002-01-01
Abstract: We outline a broad framework for word sense disambiguation (WSD) based on the Integrated Language Knowledge Base (ILKB) of the Institute of Computational Linguistics, Peking University, to serve as the guideline for our forthcoming project. A well-structured ILKB contains at least a syntactic lexicon, a semantic lexicon, and a large-scale corpus that is segmented and tagged with (POS, concept) pairs; within it, the relationship between the methods of Computational Lexicology and those of Corpus Linguistics becomes clear. Moreover, training the concept TagSet along the hypernymy tree is no longer separated from a specific statistical model, such as an HMM with two parameters (POS and concept). In short, Statistical Machine Learning will be emphasized in the construction of both the ILKB and the TagSet, and throughout the WSD project.
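To make the idea of an HMM over (POS, concept) tags concrete, the following is a minimal sketch, not the authors' implementation. It assumes each hidden state is a (POS, concept) pair and decodes a sentence with the standard Viterbi algorithm, which the abstract does not specify. The state names, the toy probabilities, and the `viterbi` helper are all hypothetical and chosen only for illustration.

```python
import math

def viterbi(words, states, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable sequence of (POS, concept) states for `words`."""
    def lg(p):
        # Floor probabilities so unseen events do not produce log(0).
        return math.log(max(p, floor))

    # best[i][s] = (log-probability of the best path ending in s, backpointer)
    best = [{s: (lg(start_p.get(s, 0.0)) + lg(emit_p[s].get(words[0], 0.0)), None)
             for s in states}]
    for w in words[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            score, arg = max((prev[r][0] + lg(trans_p[r].get(s, 0.0)), r)
                             for r in states)
            column[s] = (score + lg(emit_p[s].get(w, 0.0)), arg)
        best.append(column)

    # Follow backpointers from the best final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path

# Hypothetical joint states: each one pairs a POS tag with a concept tag.
V_ACT = ("v", "transfer")
N_FIN = ("n", "institution")
N_RIV = ("n", "landform")
states = [V_ACT, N_FIN, N_RIV]

# Toy parameters, invented for illustration only (not estimated from any corpus).
start_p = {V_ACT: 0.6, N_FIN: 0.2, N_RIV: 0.2}
trans_p = {
    V_ACT: {V_ACT: 0.1, N_FIN: 0.7, N_RIV: 0.2},
    N_FIN: {V_ACT: 0.5, N_FIN: 0.3, N_RIV: 0.2},
    N_RIV: {V_ACT: 0.5, N_FIN: 0.2, N_RIV: 0.3},
}
emit_p = {
    V_ACT: {"rob": 0.9},
    N_FIN: {"bank": 0.5},
    N_RIV: {"bank": 0.5},
}

tagged = viterbi(["rob", "bank"], states, start_p, trans_p, emit_p)
print(tagged)
```

In this toy setting the two noun senses of "bank" have identical emission probabilities, so the choice between them is driven entirely by the transition from the preceding verb state. In a real system the parameters would be estimated from the (POS, concept) tagged corpus described in the abstract.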