Deep Web Sources Focused Crawling.

LIN Chao, ZHAO Peng-peng, CUI Zhi-ming
DOI: https://doi.org/10.3969/j.issn.1000-3428.2008.07.019
2008-01-01
Abstract: Many pages on the Internet are generated dynamically from back-end databases and cannot be reached by traditional search engines; this content is called the Deep Web. This paper proposes a focused crawling algorithm for Deep Web sources. When evaluating the importance of a hyperlink, the algorithm takes into account the relevance between the page and the topic together with link-related information. Experiments indicate that the method is effective.
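The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of a focused crawler that ranks hyperlinks by combining page-topic relevance with link-related information. Everything specific here is an assumption made for illustration and is not taken from the paper: the term-overlap relevance measure, the use of anchor text and URL tokens as the link-related information, the weights `w_page`, `w_anchor`, and `w_url`, and the in-memory `TOY_WEB` graph that stands in for real HTTP fetching.

```python
import heapq
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance(text, topic_terms):
    """Fraction of topic terms that occur in the text (0.0 to 1.0).

    Assumption: simple term overlap; the paper's measure is not specified.
    """
    if not topic_terms:
        return 0.0
    return len(set(tokenize(text)) & topic_terms) / len(topic_terms)


def link_score(page_text, anchor_text, url, topic_terms,
               w_page=0.4, w_anchor=0.4, w_url=0.2):
    """Weighted mix of page relevance and link-related evidence.

    Assumption: the weights and the choice of anchor text and URL tokens
    as link-related information are illustrative only.
    """
    return (w_page * relevance(page_text, topic_terms)
            + w_anchor * relevance(anchor_text, topic_terms)
            + w_url * relevance(url, topic_terms))


def crawl(web, seed, topic_terms, max_pages=10):
    """Best-first crawl over an in-memory graph; returns the visit order."""
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so scores are negated
    seen = {seed}
    visited = []
    while frontier and len(visited) < max_pages:
        _, url = heapq.heappop(frontier)
        visited.append(url)
        page_text, links = web.get(url, ("", []))
        for anchor_text, target in links:
            if target not in seen:
                seen.add(target)
                score = link_score(page_text, anchor_text, target, topic_terms)
                heapq.heappush(frontier, (-score, target))
    return visited


# Hypothetical toy graph: url -> (page text, [(anchor text, target url), ...])
TOY_WEB = {
    "http://example.com/": (
        "book store home with search",
        [("contact us", "http://example.com/contact"),
         ("search books database", "http://example.com/book-search")],
    ),
    "http://example.com/book-search": ("search form for books", []),
    "http://example.com/contact": ("contact", []),
}

if __name__ == "__main__":
    for page in crawl(TOY_WEB, "http://example.com/", {"book", "search"}):
        print(page)
```

In this toy graph the link to the search page outranks the contact link because its anchor text and URL both contain topic terms, so the crawler visits it first even though it is listed second. A crawler aimed at real Deep Web sources would also need to detect searchable forms on the fetched pages, which this sketch leaves out.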