A Pragmatic Chinese Word Segmentation System.

Wei Jiang, Yi Guan, Xiaolong Wang
2006-01-01
Abstract: This paper presents our work for the Third International Chinese Word Segmentation Bakeoff. We apply different processing approaches to the corresponding sub-tasks, which arise in real natural language text. In our system, a trigram model with a smoothing algorithm is the core module for word segmentation, and a Maximum Entropy model is the basic model for the Named Entity Recognition task. Experiments show that the system achieves an F-measure of 96.8% on the MSRA open test of the SIGHAN 2006 bakeoff.
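To make the idea of a smoothed trigram segmenter concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say which smoothing algorithm is used, so linear interpolation with fixed weights is assumed here, and the names `train`, `prob`, `segment`, the `floor` probability, the `max_len` limit, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter

BOS = "<s>"  # sentence-start padding symbol (assumed convention)

def train(sentences):
    """Count 1- to 3-grams and their contexts over pre-segmented sentences."""
    counts, ctx = Counter(), Counter()
    for words in sentences:
        seq = [BOS, BOS] + list(words)
        for i in range(2, len(seq)):
            for n in range(1, 4):
                gram = tuple(seq[i - n + 1:i + 1])
                counts[gram] += 1
                ctx[gram[:-1]] += 1  # () holds the total token count
    return counts, ctx

def prob(w, u, v, counts, ctx, lambdas=(0.1, 0.3, 0.6), floor=1e-8):
    """P(w | u, v) by linear interpolation (assumed smoothing scheme).

    The floor keeps unseen words from getting zero probability; it is a
    crude stand-in for proper unknown-word handling.
    """
    p = floor
    for lam, gram in zip(lambdas, [(w,), (v, w), (u, v, w)]):
        c = ctx[gram[:-1]]
        if c:
            p += lam * counts[gram] / c
    return p

def segment(text, counts, ctx, max_len=4):
    """Viterbi search for the most probable word sequence under the trigram model."""
    # chart[i] maps the last two words (u, v) of a path ending at offset i
    # to the best (log-probability, word list) found so far.
    chart = [dict() for _ in range(len(text) + 1)]
    chart[0][(BOS, BOS)] = (0.0, [])
    for i in range(len(text)):
        for (u, v), (score, path) in chart[i].items():
            for j in range(i + 1, min(len(text), i + max_len) + 1):
                w = text[i:j]
                s = score + math.log(prob(w, u, v, counts, ctx))
                if (v, w) not in chart[j] or s > chart[j][(v, w)][0]:
                    chart[j][(v, w)] = (s, path + [w])
    return max(chart[-1].values(), key=lambda t: t[0])[1]

# Toy training data, invented for illustration only.
corpus = [["我", "喜欢", "北京"], ["我", "喜欢", "自然", "语言"], ["北京", "大学"]]
counts, ctx = train(corpus)
print(segment("我喜欢北京", counts, ctx))  # ['我', '喜欢', '北京']
```

The sketch only handles in-vocabulary words sensibly. A practical system would need dedicated treatment of unknown words and named entities, which the abstract assigns to a separate Maximum Entropy model.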