Some Problems of Chinese Segmentation

YU Jiangsheng, YU Shiwen
2001-01-01
Abstract: In this paper, we discuss the main problems in Chinese segmentation. First, machine segmentation ambiguity (MSA) is defined formally. The automatic identification of MSA and of the types of ambiguity is emphasized as one of the most important steps in Chinese segmentation. We then summarize the existing algorithms for Chinese segmentation (including the identification of Chinese names) with theoretical comparisons. Finally, to reach the statistically best segmentation result, we propose dynamic machine learning of the lexicon.