Acronym extraction and disambiguation in large-scale organizational web pages.

Shicong Feng,Yuhong Xiong,Conglei Yao,Liwei Zheng,Wei Liu
DOI: https://doi.org/10.1145/1645953.1646206
2009-01-01
Abstract:In this paper, we focus on the automatic extraction and disambiguation of acronyms in large-scale organizational web pages, which is important but difficult due to the diversity of acronyms and the scale of organizational web pages. We propose two novel algorithms to address the key problems in acronym extraction and disambiguation: (1) An unsupervised ranking algorithm to automatically filter out the incorrect acronym-expansion pairs. Different from the existing approaches, our method does not require any hand-crafted rules; (2) A graph-based algorithm to disambiguate ambiguous acronyms, which leverages the hyperlinks of pages to facilitate the acronym disambiguation. We evaluate the proposed approaches using two large-scale, real-world datasets in two different domains. Our experimental results show that our approach is domain independent, and achieves higher precision and recall than the existing methods.
What problem does this paper attempt to address?