Chinese Word Segmentation Method Based on Dictionary and Frequency of the Words
ZHANG HENG,YANG WENZHAO,QU JINGHUI,LU HONGBING,ZHANG LIANG,ZHAO FEI
DOI: https://doi.org/10.3969/j.issn.1008-0570.2008.03.095
2008-01-01
Abstract: Chinese word segmentation is the precondition and foundation of Chinese information processing. Chinese sentences have no explicit delimiters between words, and they contain polyphonic and polysemous words, so segmentation results unavoidably contain ambiguous words. Segmentation is therefore the "bottleneck" of Chinese information processing. This article uses a dictionary and word frequencies to decompose Chinese sentences and achieve automatic segmentation, and it applies word-frequency computation and ambiguity elimination to improve segmentation accuracy.
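
The abstract does not give the algorithm in detail, so the following is only a minimal sketch of how a dictionary and word frequencies can be combined to resolve segmentation ambiguity. It assumes a unigram model solved by dynamic programming, which may differ from the authors' actual method. The dictionary `FREQ`, its counts, the function name `segment`, and the fallback count `UNKNOWN_COUNT` are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical toy dictionary: word -> frequency count (illustrative values only).
FREQ = {
    "研究": 50,
    "研究生": 20,
    "生命": 40,
    "命": 5,
    "的": 200,
    "起源": 30,
}

# Assumed pseudo-count for a single character that is missing from the dictionary.
UNKNOWN_COUNT = 0.5


def segment(sentence, freq=FREQ):
    """Return the segmentation whose words have the highest summed log frequency."""
    max_len = max(len(word) for word in freq)
    total = sum(freq.values())
    n = len(sentence)
    # best[i] = (score, start): best score for sentence[:i] and where its last word starts.
    best = [(0.0, 0)] + [(-math.inf, 0)] * n
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = sentence[start:end]
            if word in freq:
                count = freq[word]
            elif end - start == 1:
                count = UNKNOWN_COUNT  # keep unknown characters as one-character words
            else:
                continue  # multi-character strings must be dictionary words
            score = best[start][0] + math.log(count / total)
            if score > best[end][0]:
                best[end] = (score, start)
    # Walk back from the end of the sentence to recover the chosen words.
    words = []
    i = n
    while i > 0:
        start = best[i][1]
        words.append(sentence[start:i])
        i = start
    return words[::-1]


print(segment("研究生命的起源"))  # ['研究', '生命', '的', '起源']
```

In this toy example the sentence can be read as 研究/生命/的/起源 or as 研究生/命/的/起源. Both readings consist only of dictionary words, so the dictionary alone cannot decide between them; the higher frequencies of 研究 and 生命 select the first reading.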