SG-WRAP: a Schema-Guided Wrapper Generator

XF Meng, HJ Lu, HY Wang, MZ Gu
DOI: https://doi.org/10.1109/icde.2002.994743
2002-01-01
Abstract: Although wrapper generation work has been reported in the literature, there seems to be no standard way to evaluate the performance of such systems. We conducted a series of experiments to evaluate the usability, correctness and efficiency of SG-WRAP. In the usability tests, a number of users were selected to use the system. The results indicated that, with a minimal introduction to the system, DTD definition and the structure of HTML pages, even naive users could quickly generate wrappers without much difficulty. For correctness, we adapted the precision and recall metrics from information retrieval to data extraction. The results show that, with the refining process, the system can generate wrappers with very high accuracy. Finally, the efficiency tests indicated that the wrapper generation process is fast enough even for large Web pages.
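To illustrate the correctness measure mentioned in the abstract, the following is a minimal sketch of precision and recall applied to extracted data items. The abstract does not state how the authors adapted the metrics, so this uses the standard information-retrieval definitions (precision = correct extracted items / all extracted items; recall = correct extracted items / all expected items). The function name `precision_recall` and the sample field/value pairs are hypothetical and are not taken from the paper.

```python
def precision_recall(extracted, expected):
    """Return (precision, recall) of extracted items against a hand-labelled set.

    Assumption: an item counts as correct only if it matches an expected
    item exactly. The paper's own matching criterion is not given here.
    """
    extracted_set = set(extracted)
    expected_set = set(expected)
    correct = len(extracted_set & expected_set)

    # Guard against division by zero when either set is empty.
    precision = correct / len(extracted_set) if extracted_set else 0.0
    recall = correct / len(expected_set) if expected_set else 0.0
    return precision, recall


# Hypothetical (field, value) pairs a wrapper should extract from one page.
expected = {
    ("title", "Database Systems"),
    ("author", "J. Smith"),
    ("price", "42.00"),
    ("year", "2001"),
}

# Hypothetical wrapper output: two correct items, one wrong value, one missed.
extracted = {
    ("title", "Database Systems"),
    ("author", "J. Smith"),
    ("price", "4.00"),
}

p, r = precision_recall(extracted, expected)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up case two of the three extracted items are correct and two of the four expected items are found, so a refining step that fixes the wrong price and adds the missing year would raise both figures.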