A Chinese Organization's Full Name and Matching Abbreviation Algorithm Based on Edit-Distance

黄林晟,邓志鸿,唐世渭,王文清,陈凌
2012-01-01
Abstract: The traditional string matching algorithm based on edit distance performs poorly on the specific problem of matching a Chinese organization's abbreviation to its full name. A new algorithm, also based on edit distance, is proposed. The improvements consist of the following steps: (1) making the Chinese word segmentation fit the features of Chinese grammatical structure, (2) modifying the edit-operation weights using a redefined semantic similarity, (3) adjusting these weights by adaptive learning, and (4) choosing the full name with the minimum edit distance as the matching result. Experimental results show that the algorithm achieves higher abbreviation-to-full-name matching accuracy.
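The general idea of steps (2) and (4) can be illustrated with a minimal sketch of a weighted edit distance. It is not the authors' implementation: it works on single characters rather than on the grammar-aware word segmentation of step (1), the cost values are hypothetical constants rather than weights derived from semantic similarity, and the adaptive learning of step (3) is omitted. The function names `weighted_edit_distance` and `best_match` are likewise assumptions made for illustration.

```python
from typing import Iterable, Sequence

# Hypothetical weights (not from the paper). An abbreviation usually omits
# characters of the full name, so inserting a character is assumed to be
# cheap, while deleting or substituting one is assumed to be expensive.
INSERT_COST = 0.3
DELETE_COST = 1.0
SUBSTITUTE_COST = 1.0


def weighted_edit_distance(abbr: Sequence[str], full: Sequence[str]) -> float:
    """Cost of transforming `abbr` into `full` with weighted edit operations."""
    m, n = len(abbr), len(full)
    # dp[i][j] = minimum cost of turning abbr[:i] into full[:j]
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + DELETE_COST
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + INSERT_COST
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = abbr[i - 1] == full[j - 1]
            dp[i][j] = min(
                dp[i - 1][j] + DELETE_COST,              # drop abbr[i-1]
                dp[i][j - 1] + INSERT_COST,              # add full[j-1]
                dp[i - 1][j - 1] + (0.0 if same else SUBSTITUTE_COST),
            )
    return dp[m][n]


def best_match(abbr: str, full_names: Iterable[str]) -> str:
    """Return the candidate full name with the minimum weighted distance."""
    return min(full_names, key=lambda name: weighted_edit_distance(abbr, name))


if __name__ == "__main__":
    candidates = ["北京大学", "北京师范大学", "清华大学"]
    print(best_match("北大", candidates))
```

Under these assumed weights, the abbreviation 北大 is closest to 北京大学, because only two cheap insertions are needed; the other candidates require more insertions or a substitution. Replacing the constant costs with functions of the two tokens is where a semantic-similarity measure or learned weights would plug in.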