Design and implementation of a full distributed web crawler

Zhu Kunpeng, Wang Xiaolong, Liu Yuanchao
2009-01-01
Journal of Computational Information Systems
Abstract: Distributed Web crawlers have recently received increasing attention from researchers. A fully decentralized crawler without a centralized managing server appears to be an interesting architectural paradigm for realizing large-scale information collection systems because of its scalability, failure resilience, and increased autonomy of nodes. This paper presents a novel fully distributed Web crawler system based on a structured network, together with a distributed crawling model that is developed and applied in it to improve the system's performance. Important issues such as task assignment and scalability are discussed. Finally, an experimental study is used to verify the advantages of the system, and the results are reasonably satisfactory. © 2009 Binary Information Press, August 2009.