Design and Implementation of a Distributed Web Crawler

Yang Ru
2013-01-01
Abstract: User-specified keywords are submitted to a search engine to generate seed URLs. A distributed web crawler then extracts the web pages that match the user's requirements to build a research corpus. Experiments show that the distributed web crawler is a good solution for extracting a large corpus in a short time.
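
The pipeline described in the abstract (keywords, then seed URLs, then a crawl shared across nodes) might look roughly like the following minimal sketch. It is illustrative only and is not the paper's implementation: the search URL template, the function names (seed_urls, assign_worker, crawl), and the hash-by-host partitioning scheme are all assumptions, and the "workers" are simulated in a single process with a caller-supplied fetch function instead of real network requests.

```python
import hashlib
from collections import deque
from urllib.parse import quote_plus, urlparse

# Hypothetical search endpoint; the abstract does not name the search engine used.
SEARCH_TEMPLATE = "https://search.example.com/?q={query}"


def seed_urls(keywords):
    """Build one search-results URL per user-specified keyword."""
    return [SEARCH_TEMPLATE.format(query=quote_plus(k)) for k in keywords]


def assign_worker(url, num_workers):
    """Map a URL to a worker index by hashing its host.

    Assumed partitioning scheme: all URLs of one host go to the same worker.
    """
    host = urlparse(url).netloc
    digest = hashlib.md5(host.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_workers


def crawl(seeds, fetch, num_workers, max_pages):
    """Breadth-first crawl from the seeds, stopping after max_pages pages.

    fetch(url) is supplied by the caller and returns the links found on that page.
    Returns a dict mapping each worker index to the URLs assigned to it.
    """
    unique_seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    frontier = deque(unique_seeds)
    seen = set(unique_seeds)
    assigned = {w: [] for w in range(num_workers)}
    fetched = 0
    while frontier and fetched < max_pages:
        url = frontier.popleft()
        assigned[assign_worker(url, num_workers)].append(url)
        fetched += 1
        for link in fetch(url):
            if link not in seen:  # never enqueue the same URL twice
                seen.add(link)
                frontier.append(link)
    return assigned
```

In a real deployment each worker would run on its own machine and fetch over HTTP; passing fetch in as a parameter keeps the sketch testable offline.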