An Efficient Adaptive Focused Crawler Based on Ontology Learning

C Su, Y Gao, JM Yang, B Luo
DOI: https://doi.org/10.1109/ichis.2005.19
2005-01-01
Abstract: The enormous growth of the World Wide Web in recent years has made it important to perform resource discovery efficiently. Consequently, several new ideas have been proposed; a key technique among them is focused crawling, which can crawl particular topical portions of the World Wide Web quickly without having to explore all web pages. In this paper we present an intelligent focused crawler algorithm in which we embed an ontology to evaluate a page's relevance to the topic. Compared with other algorithms that use domain knowledge, our algorithm can evolve the ontology automatically during the crawl process. Considering the intrinsic characteristics of the ontology, propagation is also introduced to accelerate the evolution of the ontology. We applied our approach to several tasks and provide an empirical evaluation that shows promising results.
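To make the idea in the abstract concrete, the following is a minimal sketch of a best-first focused crawler that scores pages against a weighted set of ontology concepts, reinforces those weights when a relevant page is found, and propagates part of each update to related concepts. It is an illustration only, not the authors' algorithm: the function names (relevance, reinforce, crawl), the parameters (rate, spread, threshold, limit), the scoring formula, and the in-memory page graph are all assumptions introduced here.

```python
import heapq


def relevance(text, weights):
    """Score a page as the mean ontology weight of its tokens (assumed formula)."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(weights.get(t, 0.0) for t in tokens) / len(tokens)


def reinforce(text, weights, related, rate=0.1, spread=0.5):
    """Raise the weight of known concepts seen on a relevant page and
    propagate a fraction of each increase to related concepts."""
    for t in set(text.lower().split()):
        if t in weights:
            weights[t] += rate
            for n in related.get(t, ()):
                weights[n] = weights.get(n, 0.0) + rate * spread


def crawl(pages, seeds, weights, related, threshold=0.2, limit=10):
    """Best-first crawl over a hypothetical in-memory graph: url -> (text, links)."""
    frontier = [(-1.0, url) for url in seeds]  # max-heap via negated priority
    heapq.heapify(frontier)
    seen, relevant = set(seeds), []
    while frontier and len(relevant) < limit:
        _, url = heapq.heappop(frontier)
        text, links = pages[url]
        score = relevance(text, weights)
        if score >= threshold:
            relevant.append(url)
            reinforce(text, weights, related)  # the ontology evolves during the crawl
        for link in links:
            if link in pages and link not in seen:
                seen.add(link)
                # Outgoing links inherit the parent page's score as their priority.
                heapq.heappush(frontier, (-score, link))
    return relevant


# Toy data standing in for the web; every value here is made up for illustration.
pages = {
    "a": ("crawler ontology web", ["b", "c"]),
    "b": ("cooking recipes pasta", ["d"]),
    "c": ("ontology learning crawler", []),
    "d": ("pasta sauce", []),
}
weights = {"crawler": 1.0, "ontology": 1.0}
related = {"ontology": ["learning"]}
print(crawl(pages, ["a"], weights, related))  # → ['a', 'c']
```

In this sketch, "learning" starts with no weight and gains one only through propagation from "ontology", which is one simple way to picture how propagation could speed up the growth of the concept set.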