CRFs Based Chinese Word Segmentation

Kun Zhi Gui, Yong Ren, Zhao Meng Peng
DOI: https://doi.org/10.4028/www.scientific.net/AMM.556-562.4376
2014-01-01
Applied Mechanics and Materials
Abstract: Chinese word segmentation is a fundamental problem in natural language processing. Conditional random fields (CRFs) are undirected graphical models that can incorporate a wide variety of features and so make full use of the information in the text. This article therefore adopts a CRF-based approach to Chinese word segmentation. The paper first gives the definition of the CRF model, its parameter learning methods, and its inference algorithms. It then introduces the tagging scheme that is widely used in Chinese word segmentation. Segmentation experiments are carried out on the Bakeoff 2005 corpora, with strong results on both the MSRA and PKU corpora: the F-measures are 0.964 and 0.943, respectively, and the out-of-vocabulary recall (R_OOV) values are 0.705 and 0.765.
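To make the tagging formulation concrete, the following is a minimal sketch of how segmentation can be cast as per-character labelling and then scored. The abstract does not say which tag set the paper uses, so the four-tag B/M/E/S scheme (begin, middle, end, single-character word) is an assumption here, and the function names `words_to_tags`, `tags_to_words`, and `f_measure` are hypothetical helpers introduced for illustration. The CRF itself is not implemented; in practice its decoder would produce the tag sequence that `tags_to_words` consumes.

```python
def words_to_tags(words):
    """Convert a segmented sentence (list of words) into per-character tags.

    Assumed tag set: B (begin), M (middle), E (end), S (single-character word).
    """
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their B/M/E/S tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends on E or S
            words.append(current)
            current = ""
    if current:  # tolerate a sequence that ends mid-word
        words.append(current)
    return words


def _spans(words):
    """Represent each word as a (start, end) character span."""
    spans, start = set(), 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def f_measure(gold_words, pred_words):
    """Word-level F-measure: harmonic mean of precision and recall over spans."""
    gold, pred = _spans(gold_words), _spans(pred_words)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


gold = ["我", "喜欢", "自然语言"]
tags = words_to_tags(gold)                 # ['S', 'B', 'E', 'B', 'M', 'M', 'E']
restored = tags_to_words("".join(gold), tags)
score = f_measure(gold, ["我", "喜欢", "自然", "语言"])  # 2 of 3 gold words matched
```

In this example the predicted segmentation splits the last gold word in two, so precision is 2/4 and recall is 2/3. R_OOV would be computed in the same way as recall, restricted to gold words that do not occur in the training vocabulary.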