Optimizing Feature Set for Chinese Word Sense Disambiguation.

Zheng-Yu Niu, Dong-Hong Ji, Chew Lim Tan
2004-01-01
Abstract: This article describes the implementation of the I2R word sense disambiguation system (I2R-WSD), which participated in one Senseval-3 task: the Chinese lexical sample task. Our core algorithm is a supervised Naive Bayes (NB) classifier. This classifier uses an optimal feature set, determined by maximizing the cross-validated accuracy of the NB classifier on the training data. The optimal feature set includes part-of-speech tags with position information in the local context and a bag of words in the topical context.
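
The selection procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the I2R-WSD implementation: the extractor names, the window size, add-one smoothing, the interleaved fold assignment, and the exhaustive search over subsets of candidate feature types are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical feature extractors. Each takes (tokens, tags, i), where i is
# the index of the target word, and returns a list of string features.
def pos_with_position(tokens, tags, i, window=2):
    """POS tags of neighbouring words, marked with their relative offset."""
    feats = []
    for off in range(-window, window + 1):
        j = i + off
        if off != 0 and 0 <= j < len(tags):
            feats.append(f"pos[{off:+d}]={tags[j]}")
    return feats

def bag_of_words(tokens, tags, i):
    """Unordered context words (topical context), excluding the target."""
    return [f"bow={w}" for k, w in enumerate(tokens) if k != i]

def train_nb(examples):
    """Count senses and features from a list of (features, sense) pairs."""
    sense_counts, feat_counts, vocab = Counter(), defaultdict(Counter), set()
    for feats, sense in examples:
        sense_counts[sense] += 1
        for f in feats:
            feat_counts[sense][f] += 1
            vocab.add(f)
    return sense_counts, feat_counts, vocab

def predict_nb(model, feats):
    """Return the sense with the highest log posterior (add-one smoothing)."""
    sense_counts, feat_counts, vocab = model
    total = sum(sense_counts.values())
    best, best_score = None, -math.inf
    for sense in sorted(sense_counts):
        denom = sum(feat_counts[sense].values()) + len(vocab)
        score = math.log(sense_counts[sense] / total)
        for f in feats:
            if f in vocab:  # features unseen in training are ignored
                score += math.log((feat_counts[sense][f] + 1) / denom)
        if score > best_score:
            best, best_score = sense, score
    return best

def cv_accuracy(instances, extractors, k=5):
    """k-fold cross-validated accuracy; instances are (tokens, tags, i, sense)."""
    data = [([f for ex in extractors for f in ex(t, g, i)], s)
            for t, g, i, s in instances]
    correct = 0
    for fold in range(k):
        test = data[fold::k]  # every k-th instance forms one fold
        train = [d for idx, d in enumerate(data) if idx % k != fold]
        model = train_nb(train)
        correct += sum(predict_nb(model, f) == s for f, s in test)
    return correct / len(data)

def select_feature_set(instances, candidates, k=5):
    """Score every non-empty subset of candidate extractors; keep the best."""
    best_names, best_acc = None, -1.0
    names = sorted(candidates)
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            acc = cv_accuracy(instances, [candidates[n] for n in subset], k)
            if acc > best_acc:  # ties keep the smaller, earlier subset
                best_names, best_acc = subset, acc
    return best_names, best_acc
```

A call such as select_feature_set(instances, {"pos": pos_with_position, "bow": bag_of_words}) returns the best-scoring subset of feature types together with its cross-validated accuracy. Exhaustive search is only practical for a handful of candidate feature types; a greedy search would be the usual substitute for larger candidate pools.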