Data source selection with similar theme in Deep Web integrated system

WANG Cheng-liang, SANG Yin-bang
DOI: https://doi.org/10.3969/j.issn.1001-3695.2011.09.045
2011-01-01
Abstract: This paper presents a method for selecting Deep Web data sources with similar themes. By analyzing the differences between data sources, the method estimates how much of a new data source's content duplicates content already held by the integrated system. It then uses precision and recall to construct a quality estimation model that assesses the quality of each data source, which weakens the negative impact that low precision has on quality assessment in existing research. Experimental results on mainstream book sites show that the method can reduce the burden on the system and obtain higher-quality results from data sources with similar themes.
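
The abstract does not give the paper's actual formulas, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes that content overlap is measured as the fraction of a candidate source's sampled records already present in the integrated system, and that precision and recall are combined with the standard F-measure. The function names, the thresholds max_overlap and min_quality, and the sample record identifiers are all hypothetical.

```python
def overlap_ratio(candidate_records: set, integrated_records: set) -> float:
    """Fraction of the candidate's sampled records already in the integrated system.

    Assumption: the abstract's "difference analysis" is approximated here
    by a simple set intersection over record identifiers.
    """
    if not candidate_records:
        return 0.0
    return len(candidate_records & integrated_records) / len(candidate_records)


def precision_recall(returned: set, relevant: set) -> tuple:
    """Standard precision and recall of returned records against relevant ones."""
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def quality_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Combine precision and recall with the F-beta measure.

    Assumption: F-beta stands in for the paper's quality estimation model,
    which the abstract does not specify.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def should_integrate(overlap: float, quality: float,
                     max_overlap: float = 0.8, min_quality: float = 0.5) -> bool:
    """Accept a source that is not too redundant and is of sufficient quality.

    The two thresholds are hypothetical illustration values.
    """
    return overlap <= max_overlap and quality >= min_quality


if __name__ == "__main__":
    integrated = {"isbn-a", "isbn-b", "isbn-c"}            # already in the system
    candidate = {"isbn-b", "isbn-c", "isbn-d", "isbn-e"}   # sampled from a new source
    relevant = {"isbn-b", "isbn-d", "isbn-x"}              # relevant to a probe query

    overlap = overlap_ratio(candidate, integrated)
    p, r = precision_recall(candidate, relevant)
    q = quality_score(p, r)
    print(overlap, p, r, q, should_integrate(overlap, q))
```

Under these assumptions, a source that mostly repeats existing content is rejected even when its quality is high, which reflects the abstract's goal of reducing the burden on the system while still favoring higher-quality sources.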