CSeg&Tag1.0

Maosong Sun, Shen Dayang, Changning Huang
DOI: https://doi.org/10.3115/974557.974575
1997-01-01
Abstract: Chinese word segmentation and POS tagging are two key techniques in many applications of Chinese information processing. Great effort has been devoted to this research over the last decade, but unfortunately no practical, high-performance system for unrestricted texts is available to date. This paper introduces CSeg&Tag1.0, a Chinese word segmenter and POS tagger that unifies these two procedures into one model. Preliminary open tests show that the segmentation precision of CSeg&Tag1.0 is about 98.0% to 99.3%, the POS tagging precision is about 91.0% to 97.1%, and the recall and precision for unknown words range from 95.0% to 99.0% and from 87.6% to 95.3%, respectively. The processing speed is about 100 characters per second on a Pentium 133 PC. Work on improving the performance of the system is ongoing.