Chinese Word Segmentation with Character Abstraction.

Le Tian, Xipeng Qiu, Xuanjing Huang
DOI: https://doi.org/10.1007/978-3-642-41491-6_4
2013-01-01
Abstract: Chinese word segmentation is an important and necessary step in analyzing Chinese texts. In this paper, we focus on the primary challenge in Chinese word segmentation: low accuracy on out-of-vocabulary words. To address this difficult problem, we group "similar" characters to generate a more abstract representation. Experimental results show that character abstraction yields a significant relative error reduction of 24.83% on average over the state-of-the-art baseline.