Research on detecting and classifying Deep Web interfaces

ZHANG Liang, LU Yu-liang, LIU Jin-hong
DOI: https://doi.org/10.3969/j.issn.1001-3695.2009.12.083
2009-01-01
Abstract: Traditional methods that match interface labels against a library are limited by the completeness of the library and the scalability of the matching algorithm. To overcome this limitation, this paper introduces a bilateral-layer model, based on the statistical characteristics of the interfaces, to detect Deep Web entries, and a text classification approach to classify them. It also presents two methods of computing feature weights and applies them to feature selection. Test results on the TEL-8 Query Interfaces dataset show the superiority of the bilateral-layer classification model and the necessity of dimensionality reduction.
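The abstract does not name the two feature-weighting methods, so the following is only an illustrative sketch of weight-based feature selection for interface text, not the authors' method. It assumes a chi-square statistic as the weight, whitespace tokenization, and toy interface labels and class names invented for the example; the function names `chi_square_weights` and `select_features` are hypothetical.

```python
from collections import Counter


def chi_square_weights(docs, labels):
    """Map each term to its largest chi-square score over all classes.

    Assumption: chi-square stands in for the paper's unnamed weighting methods.
    """
    n = len(docs)
    doc_terms = [set(d.lower().split()) for d in docs]  # term presence per document
    vocab = set().union(*doc_terms)
    class_count = Counter(labels)
    weights = {}
    for term in vocab:
        best = 0.0
        for cls in class_count:
            # 2x2 contingency table of term presence versus class membership
            a = sum(1 for t, y in zip(doc_terms, labels) if term in t and y == cls)
            b = sum(1 for t, y in zip(doc_terms, labels) if term in t and y != cls)
            c = class_count[cls] - a  # term absent, document in class
            d = n - a - b - c         # term absent, document not in class
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            score = n * (a * d - b * c) ** 2 / denom if denom else 0.0
            best = max(best, score)
        weights[term] = best
    return weights


def select_features(weights, k):
    """Keep the k highest-weighted terms (ties broken alphabetically)."""
    return sorted(weights, key=lambda t: (-weights[t], t))[:k]


# Toy interface texts and class names, invented for illustration only.
docs = [
    "title author isbn",
    "author title publisher",
    "departure arrival date",
    "arrival departure passengers",
]
labels = ["Books", "Books", "Airfares", "Airfares"]

weights = chi_square_weights(docs, labels)
selected = select_features(weights, 4)
print(selected)
```

Terms that appear in every interface of one class and in none of the other receive the highest weight, so keeping only the top-weighted terms shrinks the vocabulary before a text classifier is trained, which is the kind of dimensionality reduction the abstract refers to.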