Crawling web pages with application in online advertises monitoring system

Zhengao Xie, Shoubao Su, Huali Xu
DOI: https://doi.org/10.1109/PACCS.2010.5627009
2010-01-01
Abstract: Given the forms and features of online advertising, an effective web page crawling method, called 'Spider', is designed and implemented by analyzing the information carriers and script code of web pages. Building on search engine techniques, a duplicate-removal method is proposed that employs preemptive multi-threading. It addresses the excessive consumption of system resources and network bandwidth that arises when the Spider repeatedly downloads duplicate information while crawling the Internet. © 2010 IEEE.
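The abstract does not give implementation details, so the following is only a minimal Python sketch of the general idea it describes: several threads download pages in parallel while a shared, lock-protected set ensures that no URL is fetched twice. It should not be read as the authors' Spider. The names `SeenSet`, `normalize`, `crawl`, and the injected `fetch(url)` callable (assumed to return a page body and a list of links) are all hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urldefrag, urljoin


def normalize(url):
    """Drop the '#fragment' so variants of one page map to a single key."""
    return urldefrag(url).url


class SeenSet:
    """Thread-safe record of URLs that have already been claimed."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def claim(self, url):
        """Return True only for the first caller that presents this URL."""
        with self._lock:
            if url in self._seen:
                return False
            self._seen.add(url)
            return True


def crawl(start_url, fetch, max_pages=100, workers=4):
    """Breadth-first crawl; fetch(url) is assumed to return (body, links)."""
    seen = SeenSet()
    start = normalize(start_url)
    seen.claim(start)
    frontier = [start]
    pages = {}

    def visit(url):
        # Runs in a worker thread: download, then keep only unseen links.
        body, links = fetch(url)
        candidates = (normalize(urljoin(url, link)) for link in links)
        fresh = [u for u in candidates if seen.claim(u)]
        return url, body, fresh

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(pages) < max_pages:
            batch = frontier[: max_pages - len(pages)]
            frontier = []
            for url, body, fresh in pool.map(visit, batch):
                pages[url] = body
                frontier.extend(fresh)
    return pages
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP or HTML parsing library. Because each URL is claimed under the lock before it enters the frontier, a page linked from many places is downloaded once, which is the saving in bandwidth and system resources that the abstract refers to.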