Design and Implementation of a Web Page-gathering Tool

潘春华,常敏,武港山
DOI: https://doi.org/10.3969/j.issn.1001-3695.2002.06.050
2002-01-01
Abstract: As the Internet grows and the information available on the Web becomes more abundant, the Internet has become a new arena for traditional information processing. Before processing Web information, people often download the distributed pages to local storage, and this is the core function of the information-gathering system described in this paper. The system uses both the links between pages and the content of those pages to gather the required information. It supports topic-specific gathering through a multi-level filter, and it can run on multiple machines to improve gathering efficiency. For large-scale gathering, it stores the metadata produced during the gathering process in a large-scale database and compresses the downloaded pages. A dynamic updating mechanism keeps the local copies of Web pages up to date.
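The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: following links between pages, filtering in more than one stage, and compressing what is kept. Every name here (`gather`, `url_filter`, `content_filter`, `FAKE_WEB`) is hypothetical, the two filter stages are an assumed stand-in for the paper's multi-level filter, and an in-memory dictionary replaces real HTTP downloads so the example runs without a network.

```python
import re
import zlib
from collections import deque
from urllib.parse import urljoin, urlparse

# Naive link extractor; a real tool would use an HTML parser.
LINK_RE = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def url_filter(url, allowed_host):
    """Assumed first filter stage: keep only URLs on the target host."""
    return urlparse(url).netloc == allowed_host


def content_filter(html, keywords):
    """Assumed second filter stage: keep pages that mention a keyword."""
    text = html.lower()
    return any(keyword in text for keyword in keywords)


def gather(seed, fetch, allowed_host, keywords, max_fetches=100):
    """Breadth-first gathering starting from `seed`.

    `fetch` is any callable mapping a URL to its HTML, or None on failure.
    Returns a dict of URL -> zlib-compressed page bytes.
    """
    queue = deque([seed])
    seen = {seed}
    store = {}
    fetches = 0
    while queue and fetches < max_fetches:
        url = queue.popleft()
        html = fetch(url)
        fetches += 1
        if html is None:
            continue
        # Store only pages that pass the content stage, compressed.
        if content_filter(html, keywords):
            store[url] = zlib.compress(html.encode("utf-8"))
        # Follow links from every fetched page, subject to the URL stage.
        for href in LINK_RE.findall(html):
            link = urljoin(url, href)
            if link not in seen and url_filter(link, allowed_host):
                seen.add(link)
                queue.append(link)
    return store


# Hypothetical three-page site used in place of real downloads.
FAKE_WEB = {
    "http://example.org/":
        '<a href="/a.html">A</a> <a href="http://other.example/x">X</a> crawler news',
    "http://example.org/a.html":
        '<a href="/">home</a> <a href="b.html">B</a> nothing relevant',
    "http://example.org/b.html":
        "web crawler design notes",
}

pages = gather("http://example.org/", FAKE_WEB.get, "example.org", ["crawler"])
```

In this sketch the off-host link is dropped by the URL stage, and `a.html` is traversed for its links but not stored because it fails the content stage. Distribution across machines, database-backed metadata, and the dynamic update mechanism mentioned in the abstract are deliberately left out.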