Corpus-based method for differentiating genuine and spurious combinational ambiguity

Likun Qiu, Helei Hu, Yunfang Wu
2013-01-01
Abstract: Resolving combinational ambiguity is a crucial issue in Chinese word segmentation. Previous studies mainly discussed how to resolve word segmentation ambiguity in a given context, yet paid little attention to the combinational ambiguities found in annotated corpora, all of which have usually been treated as genuine ambiguities. However, some of those ambiguities arise from annotation inconsistency. This paper proposes a method for differentiating genuine and spurious combinational ambiguities, using both context similarity and structural similarity. The spurious ambiguities it identifies might be used to reduce inconsistency in annotated corpora. Experiments show the effectiveness of the proposed method. © 2013 ICIC International.