Website Crawling for Specific Topics

LI Gang, ZHOU Li-Zhu, GUO Qi, LIN Ling
DOI: https://doi.org/10.3969/j.issn.1002-137X.2007.02.034
2007-01-01
Computer Science
Abstract: In this paper, we propose a new approach for discovering Websites on a specific topic on the WWW with high precision and low cost. The approach improves on traditional focused-crawler techniques. Unlike a common Web crawler, which traverses the Web graph composed of HTML pages and hyperlinks, our crawler uses meta-search to obtain the URLs of relevant pages, then applies a heuristic search method to reduce the search cost and topic-relevance rules to increase precision. The experimental results show that the presented approach is both effective and efficient.
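The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea under stated assumptions: seed URLs stand in for meta-search results, a best-first priority queue stands in for the heuristic search, and a keyword-overlap score stands in for the topic-relevance rules. The names `TOPIC_KEYWORDS`, `relevance`, `rank_sites`, the `fetch` callback, and the in-memory `WEB` dictionary are all hypothetical and are not taken from the paper.

```python
import heapq
from urllib.parse import urlparse

# Hypothetical topic rule (an assumption, not the paper's rule set):
# relevance is the fraction of these keywords found in the page text.
TOPIC_KEYWORDS = {"crawler", "search", "index"}


def relevance(text, keywords=TOPIC_KEYWORDS):
    """Return the fraction of topic keywords that occur in the text."""
    words = set(text.lower().split())
    return len(words & keywords) / len(keywords)


def rank_sites(seed_urls, fetch, max_pages=10, threshold=0.5):
    """Best-first crawl from seed URLs; return {site: mean page relevance}.

    seed_urls : URLs assumed to come from a meta-search step.
    fetch     : callable mapping a URL to (text, links), or None if missing.
    """
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, url) for url in seed_urls]
    heapq.heapify(frontier)
    seen = set(seed_urls)
    scores = {}
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        page = fetch(url)
        if page is None:
            continue
        text, links = page
        visited += 1
        score = relevance(text)
        scores.setdefault(urlparse(url).netloc, []).append(score)
        if score < threshold:
            continue  # heuristic pruning: off-topic pages are not expanded
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits the relevance of the page that cites it.
                heapq.heappush(frontier, (-score, link))
    return {site: sum(v) / len(v) for site, v in scores.items()}


# Toy in-memory "Web" so the sketch runs without network access.
WEB = {
    "http://a.example/": ("web crawler search index",
                          ["http://a.example/p1", "http://b.example/"]),
    "http://a.example/p1": ("focused crawler search", []),
    "http://b.example/": ("cooking recipes", ["http://b.example/x"]),
    "http://b.example/x": ("more recipes", []),
}

result = rank_sites(["http://a.example/"], WEB.get)
print(result)
```

In this toy run the off-topic page on `b.example` is scored but not expanded, so `http://b.example/x` is never fetched. That is the sense in which pruning by relevance reduces search cost while the per-site score reflects precision.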