A Comparative Study on Chinese Word Segmentation Using Statistical Models

Meng Wenchao, Liu Lianchen, Chen Anyan
DOI: https://doi.org/10.1109/icsess.2010.5552323
2010-01-01
Abstract: In recent years, character-based approaches to Chinese word segmentation have been developed and have shown great success. This paper presents a detailed comparison of three statistical models: HMM, MEMM, and CRF. First, different tag sets are used to evaluate the models' precision and efficiency. Next, HMM and MEMM are compared using similar features. Different features are then compared to determine which contributes most to Chinese word segmentation. Finally, some suggestions are given for developing Chinese word segmentation systems.
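To illustrate what a tag set means in character-based segmentation, the sketch below converts a segmented sentence into per-character tags and back. The abstract does not say which tag sets the paper evaluates; the 2-tag (B/I) and 4-tag (B/M/E/S) schemes shown here are common choices and are assumptions, as are the function names `words_to_tags` and `tags_to_words`.

```python
def words_to_tags(words, scheme="BMES"):
    """Map a list of words to one tag per character.

    Assumed schemes (not taken from the paper):
      "BI":   B = first character of a word, I = any later character
      "BMES": B = begin, M = middle, E = end, S = single-character word
    """
    tags = []
    for word in words:
        n = len(word)
        if n == 0:
            continue  # ignore empty tokens
        if scheme == "BI":
            tags.extend(["B"] + ["I"] * (n - 1))
        elif scheme == "BMES":
            if n == 1:
                tags.append("S")
            else:
                tags.extend(["B"] + ["M"] * (n - 2) + ["E"])
        else:
            raise ValueError(f"unknown scheme: {scheme}")
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and tags; a new word starts at B or S."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag in ("B", "S") and current:
            words.append(current)
            current = ""
        current += ch
    if current:
        words.append(current)
    return words


if __name__ == "__main__":
    sentence = ["我", "喜欢", "自然语言"]
    chars = "".join(sentence)
    for scheme in ("BI", "BMES"):
        tags = words_to_tags(sentence, scheme)
        print(scheme, list(zip(chars, tags)), tags_to_words(chars, tags))
```

A sequence model such as an HMM, MEMM, or CRF would then predict one tag per character, and segmentation is recovered by decoding the tag sequence as in `tags_to_words`. Larger tag sets encode more positional detail per character at the cost of more labels to learn.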