Anchor text extraction for academic search

Shuming Shi, Fei Xing, Mingjie Zhu, Zaiqing Nie, Ji-Rong Wen
DOI: https://doi.org/10.3115/1699750.1699753
2009-01-01
Abstract: Anchor text plays an especially important role in improving the performance of general Web search because it provides a relatively objective description of a Web page, contributed by a potentially large number of other Web pages. Academic search provides indexing and search functionality for academic articles. It may be desirable to utilize anchor text in academic search as well, to improve the quality of search results. The main challenge is that no explicit URLs or anchor text are available for academic articles. In this paper we define and automatically assign a pseudo-URL for each academic article. A machine learning approach is adopted to extract pseudo-anchor text for academic articles by exploiting the citation relationships between them. The extracted pseudo-anchor text is then indexed and used in computing the relevance scores of academic articles. Experiments conducted on 0.9 million research papers show that our approach is able to dramatically improve search performance.
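To make the idea of pseudo-URLs and pseudo-anchor text concrete, the following is a minimal sketch and not the method of the paper. It assumes a pseudo-URL can be a stable key derived from an article's normalized title and year, and it replaces the paper's machine learning extractor with a plain fixed-window heuristic around each citation marker. The names pseudo_url, pseudo_anchor_text, collect_anchor_text, the paper:// scheme, and the window parameter are all hypothetical.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pseudo_url(title: str, year: int) -> str:
    """Hypothetical pseudo-URL: a stable key built from the normalized title and year."""
    # Lowercase and collapse punctuation so formatting variants map to the same key.
    normalized = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    digest = hashlib.sha1(f"{normalized}|{year}".encode("utf-8")).hexdigest()[:12]
    return f"paper://{digest}"


def pseudo_anchor_text(body: str, marker: str, window: int = 8) -> List[str]:
    """Return the words surrounding each occurrence of a citation marker such as '[2]'.

    Assumption: a fixed word window stands in for the learned extractor.
    """
    tokens = body.split()
    snippets = []
    for i, token in enumerate(tokens):
        if marker in token:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            snippets.append(" ".join(left + right))
    return snippets


def collect_anchor_text(
    citing_docs: Iterable[Tuple[str, Dict[str, Tuple[str, int]]]],
    window: int = 8,
) -> Dict[str, List[str]]:
    """Group extracted snippets by the pseudo-URL of the cited article.

    Each citing document is (body text, {citation marker: (cited title, year)}).
    """
    index = defaultdict(list)
    for body, references in citing_docs:
        for marker, (title, year) in references.items():
            index[pseudo_url(title, year)].extend(
                pseudo_anchor_text(body, marker, window)
            )
    return dict(index)
```

The resulting mapping from pseudo-URL to snippets could then be indexed as an extra text field of the cited article, which is the role anchor text plays in general Web search.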