A Study in Dictionary-Based All-word Word Sense Disambiguation for Pre-Qin Chinese

ZHANG Yingjie, LI Bin, CHEN Jiajun, CHEN Xiaohe
DOI: https://doi.org/10.3969/j.issn.1003-0077.2012.03.012
2012-01-01
Abstract: Word sense disambiguation (WSD) is a basic task in natural language processing, including the processing of ancient Chinese documents. In this paper we focus on the specific field of analyzing pre-Qin ancient Chinese documents. Considering the shortage of training data and semantic resources, we employ a semi-supervised machine learning method to perform all-word WSD on Zuo Zhuan, using Chinese Dictionary v2.0 as the knowledge resource. We randomly select 22 words of different frequencies and numbers of senses to evaluate the proposed method. On the selected words, our method achieves an average accuracy of 67%, which is significantly higher than the baseline of selecting the most frequent sense. This method is promising for sense tagging of ancient Chinese documents when no training data is available. It also provides a raw sense-tagging result for human correction, which can enrich traditional dictionaries that often suffer from insufficient word sense entries.
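The most-frequent-sense baseline and the average accuracy mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy sense labels, and the choice to average accuracy per word (rather than per token) are assumptions made for illustration, and the abstract does not say where the baseline's sense frequencies come from.

```python
from collections import Counter

def most_frequent_sense(tagged):
    """Map each word to its most common sense in (word, sense) pairs."""
    counts = {}
    for word, sense in tagged:
        counts.setdefault(word, Counter())[sense] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def average_accuracy(predict, gold):
    """Per-word accuracy averaged over words (assumed averaging scheme)."""
    hits, totals = Counter(), Counter()
    for word, sense in gold:
        totals[word] += 1
        hits[word] += int(predict(word) == sense)
    return sum(hits[w] / totals[w] for w in totals) / len(totals)

# Hypothetical hand-tagged sample; the sense labels are invented.
gold = [
    ("师", "army"), ("师", "army"), ("师", "teacher"),
    ("之", "go"), ("之", "pronoun"), ("之", "pronoun"), ("之", "pronoun"),
]

mfs = most_frequent_sense(gold)             # baseline sense per word
baseline_acc = average_accuracy(mfs.get, gold)
print(mfs, round(baseline_acc, 3))
```

Any disambiguation method can be scored the same way by passing its own prediction function in place of `mfs.get`, which makes the comparison against the baseline direct.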