Customized focused crawler for peer-to-peer Web search

Fang Qiming, Yang Guangwen, Wu Yongwei, Zhu Anping, Zheng Weimin
DOI: https://doi.org/10.3321/j.issn:1671-4512.2007.z2.039
2007-01-01
Abstract: A customized focused crawler is proposed to meet the higher demands of peer-to-peer (P2P) Web search. The crawler employs a simple topic description method for convenient customization and uses link navigation based on the link structure of Web sites to achieve efficient crawling. By setting a configuration file, a peer can easily customize a lightweight focused crawler with lower resource consumption, higher crawling precision, and better manageability. Furthermore, a novel data collection updating mechanism for the crawler is presented. This mechanism is a hybrid of incremental update and batch update, which reduces the complexity of incremental update and has lower overhead than batch update. Experimental results indicate that this hybrid updating mechanism can obtain high freshness and recall.
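The abstract does not give the details of the hybrid updating mechanism, so the following is only an illustrative sketch of one way incremental and batch updates might be combined, not the authors' design. It assumes that pages flagged as frequently changing are re-fetched individually on a short interval, while the whole collection is refreshed in a periodic batch pass. The names `Page`, `HybridUpdater`, `hot`, `hot_interval`, and `batch_interval` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    url: str
    last_fetched: float   # time of the last fetch, in seconds
    hot: bool = False     # hypothetical flag: page is assumed to change often


@dataclass
class HybridUpdater:
    batch_interval: float  # assumed: seconds between full batch refreshes
    hot_interval: float    # assumed: seconds between re-fetches of hot pages
    last_batch: float = 0.0
    pages: dict = field(default_factory=dict)

    def add(self, page):
        self.pages[page.url] = page

    def due(self, now):
        """Return the URLs to re-fetch at time `now` and mark them fetched."""
        if now - self.last_batch >= self.batch_interval:
            # Batch step: refresh the whole collection in one pass.
            self.last_batch = now
            selected = list(self.pages)
        else:
            # Incremental step: refresh only hot pages whose copy is stale.
            selected = [
                url for url, p in self.pages.items()
                if p.hot and now - p.last_fetched >= self.hot_interval
            ]
        for url in selected:
            self.pages[url].last_fetched = now
        return selected


updater = HybridUpdater(batch_interval=100, hot_interval=10)
updater.add(Page("a", 0.0, hot=True))
updater.add(Page("b", 0.0))
first = updater.due(10)    # incremental step
second = updater.due(100)  # batch step
```

In this sketch, only the small set of hot pages is tracked individually between batch passes, which is one plausible reading of how a hybrid could keep incremental bookkeeping simple while avoiding a full re-crawl on every cycle.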