Chinese Analyzer for Search Engine-Lucene

胡长春,刘功申
DOI: https://doi.org/10.3778/j.issn.1002-8331.2009.12.051
2009-01-01
Abstract: The word segmentation algorithms of most Chinese analyzers for the Lucene search engine do not match the way Chinese is actually written and read. To overcome this deficiency, this paper proposes a new Chinese analyzer based on the maximal match algorithm and a standard dictionary. Experimental results show that the proposed analyzer's word segmentation conforms to Chinese usage and that its indexing performance is very close to that of analyzers based on mechanical segmentation. In addition, retrieval efficiency improves by a factor of 2 to 4, and the retrieval response rate improves by 59%.
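To illustrate dictionary-based maximal matching, here is a minimal Python sketch. The abstract does not say whether the paper scans forward or backward, how it handles out-of-dictionary characters, or how the analyzer plugs into Lucene. This sketch assumes a forward scan with a single-character fallback. The function name `forward_max_match`, its parameters, and the toy dictionary are hypothetical and are not taken from the paper.

```python
def forward_max_match(text, dictionary, max_word_len=None):
    """Segment text by greedy forward maximal matching (illustrative sketch).

    Assumption: at each position, take the longest dictionary word that
    starts there; if none matches, emit the single character as a token.
    """
    if max_word_len is None:
        # Longest dictionary entry bounds the candidate window.
        max_word_len = max((len(w) for w in dictionary), default=1)

    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary hit.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Toy dictionary for illustration only (not the paper's standard dictionary).
toy_dictionary = {"中文", "搜索", "引擎", "搜索引擎", "分词"}
segmented = forward_max_match("中文搜索引擎分词", toy_dictionary)
with_unknown = forward_max_match("中文的分词", toy_dictionary)
```

With this toy dictionary, the longer entry "搜索引擎" is preferred over its parts "搜索" and "引擎". Mechanical segmentation, by contrast, splits text into single characters or fixed-size n-grams without consulting a dictionary.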