An Approximate String Matching Algorithm for Chinese Information Retrieval Systems

Jing-fan WANG, Xiao-jun WU, Yun-qing XIA, Fang ZHENG
DOI: https://doi.org/10.3969/j.issn.1003-0077.2007.06.009
2007-01-01
Abstract: In modern Chinese information retrieval systems, classical keyword-based string matching fails when the input string differs from the entries in the database. This paper proposes a method based on Tarhio and Ukkonen's filtering algorithm to solve the problem. Because text typed with Pinyin input methods often contains wrong characters that have the same or a similar pronunciation as the intended ones, we define a special edit distance and extend our method accordingly. The experimental results show that our algorithm improves the recall rate of retrieval systems and achieves sub-linear complexity in practice.
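The abstract does not give the definition of the special edit distance, so the following is only a minimal sketch of the general idea, assuming that substituting a character with a homophone costs less than an arbitrary substitution. The `PINYIN` table, the cost values (0.5 and 1.0), and the function names are hypothetical and not taken from the paper; the Tarhio and Ukkonen filtering step is not shown.

```python
from typing import Dict

# Hypothetical toy pronunciation table (assumption, not from the paper).
# A real system would use a full character-to-Pinyin dictionary.
PINYIN: Dict[str, str] = {
    "张": "zhang",
    "章": "zhang",
    "三": "san",
    "李": "li",
}


def sub_cost(a: str, b: str) -> float:
    """Substitution cost: 0 for identical characters, a reduced cost
    (assumed 0.5) for homophones, and 1 otherwise."""
    if a == b:
        return 0.0
    pa, pb = PINYIN.get(a), PINYIN.get(b)
    if pa is not None and pa == pb:
        return 0.5
    return 1.0


def pinyin_edit_distance(s: str, t: str) -> float:
    """Weighted edit distance via the standard dynamic program,
    keeping only the previous row. Insertions and deletions cost 1."""
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        cur = [float(i)]
        for j, b in enumerate(t, 1):
            cur.append(min(
                prev[j] + 1.0,               # delete a from s
                cur[j - 1] + 1.0,            # insert b into s
                prev[j - 1] + sub_cost(a, b) # substitute or match
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(pinyin_edit_distance("张三", "章三"))  # homophone substitution
    print(pinyin_edit_distance("张三", "李三"))  # unrelated substitution
```

Under these assumed costs, a query containing a homophone typo lies closer to the intended database entry than a query with an unrelated character, which is the property an approximate matcher can use to raise recall.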