Chinese Word Segmentation Evaluation Methodology Based on Web Search Engines

WANG Hua-dong, RAO Pei-lun
DOI: https://doi.org/10.3969/j.issn.1007-7634.2007.01.022
IF: 8.1
2007-01-01
Information Sciences
Abstract: Chinese word segmentation is one of the determinants of result quality in Chinese search engines. Whether Chinese words are segmented effectively and correctly is vital to improving the relevance of search results and enhancing user satisfaction. The authors first review the fundamental theories on which Chinese segmentation evaluation methods are built, and then develop an integrated methodology for measuring the quality of Chinese segmentation in web search engines. A set of methods and guidelines is proposed, addressing sampling issues, selection of evaluators, definition and selection of metrics, the evaluation procedure, and related matters. The methodology was then applied in an evaluation of a real search engine and proved to be effective. The results of the evaluation are analyzed, and suggestions concerning evaluator screening and item rejection are provided, with the aim of improving evaluation performance.