URL ordering policies for distributed crawlers: a review

Deepika, Ashutosh Dixit
DOI: https://doi.org/10.48550/arXiv.1611.01228
2015-12-30
Abstract: As the web grows, the amount of information available on it grows as well. Search engines are the medium for accessing this information. The crawler is the search engine module responsible for downloading web pages. To download fresh information and keep the database rich, the crawler should crawl the web in some order; this is called URL ordering. URL ordering should be done efficiently and effectively so that the web is crawled proficiently. This paper surveys some existing URL ordering methods and concludes with a comparison among them.
Information Retrieval