An Improved CRF Based Chinese Language Processing System for SIGHAN Bakeoff 2007.

Xihong Wu, Xiaojun Lin, Xinhao Wang, Chunyao Wu, Yaozhong Zhang, Dianhai Yu
2008-01-01
Abstract: This paper describes three systems submitted to the Fourth International Chinese Language Processing Bakeoff: a Chinese word segmentation (WS) system, a named entity recognition (NER) system, and a part-of-speech (POS) tagging system. Conditional Random Fields (CRFs) serve as the primary models. For the WS and NER tracks, an n-gram language model is incorporated into the CRF-based systems to take higher-level language information into account. Furthermore, to improve the performance of the submitted systems, a transformation-based learning (TBL) technique is adopted for post-processing.
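As an illustration of how word segmentation can be cast as a sequence labeling task suitable for a CRF, the following is a minimal sketch that converts between segmented words and per-character tags. The abstract does not state which tag set the authors use; the B/M/E/S scheme and the function names `words_to_tags` and `tags_to_words` are assumptions made here for illustration only.

```python
# Hypothetical sketch: the B/M/E/S tag set is an assumption, not taken from the paper.
# B = begin of word, M = middle, E = end, S = single-character word.

def words_to_tags(words):
    """Map a segmented sentence to one tag per character."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their (predicted) tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        # A word closes whenever an E or S tag is seen.
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        # Keep any trailing characters left open by an inconsistent tag sequence.
        words.append(current)
    return words


segmented = ["我们", "喜欢", "语言学"]
tags = words_to_tags(segmented)
restored = tags_to_words("".join(segmented), tags)
```

In such a setup, a sequence model would predict the tag for each character, and `tags_to_words` would turn the predicted tags back into a segmentation.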