Chinese Word POS Tagging with Markov Logic.

Zhihua Liao, Qixian Zeng, Qiyun Wang
DOI: https://doi.org/10.1007/978-3-319-18455-5_7
2015-01-01
Abstract: Because the Chinese language has many unique characteristics, many conventional machine learning approaches to word POS tagging are unsatisfactory and inefficient when applied to the Chinese POS tagging task. Accordingly, the performance of state-of-the-art Chinese word POS tagging is not as good as that for other languages such as English. In this paper, we present a novel Chinese word POS tagging method. We first assume that the character-based Chinese sentences have been fully segmented into words beforehand and that the phonetic notation (pinyin) of the corresponding words is also available. Then, we employ Markov Logic Networks (MLNs) to identify the Chinese word POS tags. MLNs can represent the rich grammatical structure of Chinese easily and flexibly. Furthermore, to support experiments and comparison, we build two benchmark datasets, dataset1 and dataset2, which correspond to two different sentence types: dataset1 consists of short sentences and dataset2 of long ones. The experimental results demonstrate that our approach significantly improves on the state-of-the-art performance of other POS tagging methods on these datasets with different sentence types.