A Deep Web Query Interface Discovery Method

Bo Liu, Zhenxing Li
DOI: https://doi.org/10.1109/FSKD.2015.7382138
2015-01-01
Abstract: To identify deep web query interfaces among web forms accurately, this paper proposes a framework for automatic deep web discovery consisting of four procedures: collecting web pages, extracting forms and their features, filtering forms, and identifying forms. A heuristic rule-based k-nearest neighbor algorithm is introduced for identifying the query interfaces. In the experiments, a number of query interfaces and non-query interfaces from different domains are selected for classification. The experimental results demonstrate that the presented algorithm can significantly improve the accuracy of deep web query interface discovery.
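
The extraction, filtering, and identification procedures named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the features, the heuristic rules, or the distance measure, so the control-count features, the two rules in `passes_heuristics`, the Euclidean distance, and all function names below are assumptions. Page collection is omitted; the sketch starts from HTML that has already been fetched.

```python
from collections import Counter
from html.parser import HTMLParser
import math

# Assumed feature set: counts of a few control types per form.
FEATURES = ("text", "password", "select", "checkbox", "submit")
INPUT_KINDS = ("text", "password", "checkbox", "submit")


class FormFeatureParser(HTMLParser):
    """Extracts one feature dict per <form> element."""

    def __init__(self):
        super().__init__()
        self.forms = []       # finished forms, in document order
        self._current = None  # feature dict of the form being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self._current = {name: 0 for name in FEATURES}
        elif self._current is not None:
            if tag == "select":
                self._current["select"] += 1
            elif tag == "input":
                # An <input> without a type attribute defaults to "text".
                kind = (dict(attrs).get("type") or "text").lower()
                if kind in INPUT_KINDS:
                    self._current[kind] += 1

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def extract_forms(html):
    """Form and feature extraction step."""
    parser = FormFeatureParser()
    parser.feed(html)
    return parser.forms


def passes_heuristics(form):
    """Filtering step with two hypothetical rules:
    reject login forms and forms with nothing to fill in."""
    if form["password"] > 0:
        return False
    return form["text"] + form["select"] + form["checkbox"] > 0


def to_vector(form):
    return [form[name] for name in FEATURES]


def knn_predict(train, x, k=3):
    """Majority vote among the k training vectors closest to x."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def is_query_interface(form, train, k=3):
    """Identification step: heuristic rules first, then k-NN."""
    if not passes_heuristics(form):
        return False
    return knn_predict(train, to_vector(form), k) == "query"
```

Here `train` is a list of `(feature_vector, label)` pairs with labels `"query"` and `"non-query"`. Running the rules before the k-NN vote lets obvious non-query forms be discarded cheaply, which is one plausible reading of "heuristic rule-based" in the abstract.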