Improved forward algorithm for maximum matching word segmentation

Yuan Jian
Abstract: To reduce the error rate of the forward maximum matching (FMM) word segmentation algorithm, the causes of its errors are analyzed and an improved FMM algorithm is presented that adds a module for handling crossing (overlapping) ambiguity fields. First, the text is pre-segmented. Then, during the traditional maximum matching pass, the crossing-ambiguity module is called: after each forward match it looks back at the match, comparing the mutual information between the last character of the current word and the rest of that word with the mutual information between that last character and the following character, to determine the cut point. Finally, the remaining word fragments are processed. Tests on randomly selected text demonstrate the effectiveness of the presented method.
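The look-back step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the toy lexicon, and the character counts are all hypothetical, and pointwise mutual information between adjacent characters is assumed as a stand-in for the paper's mutual information measure. Pre-segmentation and the final fragment-processing step are not modeled.

```python
import math

def pmi(x, y, unigram, bigram, total):
    """Pointwise mutual information log(p(x,y) / (p(x) p(y))) from raw counts."""
    c_xy = bigram.get((x, y), 0)
    c_x, c_y = unigram.get(x, 0), unigram.get(y, 0)
    if c_xy == 0 or c_x == 0 or c_y == 0:
        return float("-inf")  # unseen pair: no evidence of association
    return math.log(c_xy * total / (c_x * c_y))

def longest_match(text, i, lexicon, max_len):
    """Length of the longest lexicon word starting at i (1 if none matches)."""
    for n in range(min(max_len, len(text) - i), 1, -1):
        if text[i:i + n] in lexicon:
            return n
    return 1

def segment(text, lexicon, unigram, bigram, total, max_len=4):
    """Forward maximum matching with a look-back check after each match."""
    words, i = [], 0
    while i < len(text):
        n = longest_match(text, i, lexicon, max_len)
        end = i + n
        # Crossing ambiguity (assumed test): the last character of the match
        # also begins a lexicon word that extends past the match.
        if n >= 2 and end < len(text) and longest_match(text, end - 1, lexicon, max_len) >= 2:
            inside = pmi(text[end - 2], text[end - 1], unigram, bigram, total)
            across = pmi(text[end - 1], text[end], unigram, bigram, total)
            if across > inside:
                n -= 1  # move the cut point one character to the left
        words.append(text[i:i + n])
        i += n
    return words

# Hypothetical toy data, chosen only to trigger the ambiguity check.
lexicon = {"研究", "研究生", "生命"}
unigram = {"究": 50, "生": 100, "命": 40}
bigram = {("究", "生"): 5, ("生", "命"): 30}
total = 10000

print(segment("研究生命", lexicon, unigram, bigram, total))  # ['研究', '生命']
```

With empty count tables the check never fires, so the same function falls back to plain forward maximum matching and returns `['研究生', '命']` for this input, which is the kind of error the added module is meant to correct.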
Computer Science