Discovery and Classification Model for Deep Web Sources

马丹,王翰虎,陈梅,张小平
DOI: https://doi.org/10.3969/j.issn.1673-629x.2010.07.017
2010-01-01
Abstract: With the development of the Internet, the Web is used ever more widely in daily life. Traditional search engines can reach only the surface Web and cannot reach Deep Web sources. To use Deep Web sources efficiently, they must first be discovered and classified. This work focuses on Deep Web classification and proposes a novel classification model. The model works in two steps: first, it uses features of Deep Web query interfaces to recognize whether a Web page is a Deep Web source; second, it identifies the specific subject of the Deep Web source using the KNN algorithm. Experiments show that the average correct classification rate is 86.9%, and the detailed results are listed at the end of the paper.
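The two-step process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the query-interface features, the text representation, or the value of k, so the form heuristic in `is_deep_web` (a form with text or select fields and no password field), the bag-of-words cosine similarity, and all function names here are assumptions chosen only for illustration.

```python
import math
from collections import Counter
from html.parser import HTMLParser


class FormFeatures(HTMLParser):
    """Collects simple, hypothetical query-interface features from a page."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.text_inputs = 0
        self.selects = 0
        self.passwords = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
        elif self.in_form and tag == "input":
            # An <input> without a type attribute defaults to a text field.
            kind = (attrs.get("type") or "text").lower()
            if kind in ("text", "search"):
                self.text_inputs += 1
            elif kind == "password":
                self.passwords += 1
        elif self.in_form and tag == "select":
            self.selects += 1

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def is_deep_web(html):
    """Step 1 (assumed rule): a searchable form that is not a login form."""
    parser = FormFeatures()
    parser.feed(html)
    return parser.passwords == 0 and (parser.text_inputs + parser.selects) >= 1


def vectorize(text):
    """Bag-of-words term counts (an assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def knn_classify(text, labeled, k=3):
    """Step 2: majority vote among the k most similar labeled examples."""
    query = vectorize(text)
    ranked = sorted(
        labeled, key=lambda item: cosine(query, vectorize(item[0])), reverse=True
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In use, a page would first be passed to `is_deep_web`, and only pages that pass would have their query-interface text (for example, field labels) handed to `knn_classify` together with a small set of labeled examples per subject.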