Blending segmentation with tagging in Chinese language corpus processing

Zhou Qiang, Yu Shiwen
DOI: https://doi.org/10.3115/991250.991361
1994-01-01
Abstract: This paper proposes a new method for Chinese language corpus processing. Unlike previous research, our approach has the following characteristics: it blends segmentation with tagging, and it integrates a rule-based approach with a statistics-based one in grammatical disambiguation. The principal ideas presented in the paper are incorporated in the development of a Chinese corpus processing system. Experimental results show that the overall accuracy is 97.68% for segmentation and 94.55% for tagging over about 400,000 Chinese characters.