Try to Substitute: an Unsupervised Chinese Word Sense Disambiguation Method Based on HowNet.

Bairu Hou, Fanchao Qi, Yuan Zang, Xurui Zhang, Zhiyuan Liu, Maosong Sun
DOI: https://doi.org/10.18653/v1/2020.coling-main.155
2020-01-01
Abstract: Word sense disambiguation (WSD) is a fundamental natural language processing task. Unsupervised knowledge-based WSD relies only on a lexical knowledge base as the sense inventory, so it has wider practical use than supervised WSD, which requires a large amount of sense-annotated data. HowNet is the most widely used lexical knowledge base in Chinese WSD. Because of its uniqueness, however, most existing unsupervised WSD methods cannot work for HowNet-based WSD, and the tailor-made methods have not obtained satisfactory results. In this paper, we propose a new unsupervised method for HowNet-based Chinese WSD, which exploits the masked language model task of pre-trained language models. For the experiments, because the existing evaluation dataset is small and out of date, we build a new and larger HowNet-based WSD dataset. Experimental results demonstrate that our model achieves significantly better performance than all the baseline methods. All the code and data of this paper are available at https://github.com/thunlp/SememeWSD.
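The abstract does not spell out how the masked language model task is used, so the following is only a rough sketch of one substitution-style scheme suggested by the title: mask the target word, ask a scorer how plausible each sense's substitute words are in that slot, and pick the sense with the highest average score. Everything named here (`disambiguate`, `toy_score`, the sense labels, the substitute lists, and the English example) is a hypothetical stand-in for illustration, not the authors' implementation or the SememeWSD API. In practice the scorer would wrap a pre-trained masked language model, and the substitute lists would come from HowNet.

```python
from typing import Callable, Dict, List

# Hypothetical scorer type: given a sentence containing one mask slot and a
# candidate word, return the probability that the word fills the slot.
MaskScorer = Callable[[str, str], float]


def disambiguate(
    tokens: List[str],
    target_index: int,
    sense_substitutes: Dict[str, List[str]],
    score: MaskScorer,
    mask_token: str = "[MASK]",
) -> str:
    """Return the sense whose substitute words best fit the masked slot."""
    # Replace the ambiguous word with the mask token.
    masked = tokens[:target_index] + [mask_token] + tokens[target_index + 1:]
    context = " ".join(masked)

    best_sense = None
    best_score = float("-inf")
    for sense, substitutes in sense_substitutes.items():
        if not substitutes:
            continue  # A sense with no substitutes cannot be scored.
        # Average the scorer's probability over this sense's substitutes.
        avg = sum(score(context, word) for word in substitutes) / len(substitutes)
        if avg > best_score:
            best_sense, best_score = sense, avg

    if best_sense is None:
        raise ValueError("no sense has any substitute words")
    return best_sense


def toy_score(context: str, word: str) -> float:
    """Stand-in for a masked language model; the numbers are made up."""
    table = {"lender": 0.30, "company": 0.20, "shore": 0.01, "embankment": 0.02}
    return table.get(word, 0.0)


# Toy English example; the sense inventory below is invented for illustration.
tokens = "She deposited cash at the bank".split()
senses = {
    "institution": ["lender", "company"],
    "riverside": ["shore", "embankment"],
}
chosen = disambiguate(tokens, 5, senses, toy_score)
print(chosen)
```

Passing the scorer in as a function keeps the selection logic independent of any particular model, so the toy table can be swapped for real masked-language-model probabilities without changing `disambiguate`.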