SearchWebDB: Data Web Search on a Pay-As-You-Go Integration Infrastructure

Thanh Tran, Haofen Wang, Peter Haase
2009-01-01
Abstract: The Web as a global information space is developing from a Web of documents to a Web of data. This development opens new ways of addressing complex information needs. Search is no longer limited to matching keywords against documents; instead, complex information needs can be expressed in a structured way, with precise answers as results. In this paper, we present SearchWebDB, an infrastructure for data web search that addresses a number of challenges involved in realizing search on the data web. To provide an end-user-oriented interface, we support expressive keyword search by translating user information needs into structured queries. We integrate heterogeneous web data sources with automatically computed mappings. Schema-level mappings are exploited in constructing structured queries against the integrated schema. These structured queries are decomposed into queries against the local web data sources, which are then processed in a distributed way. Finally, heterogeneous result sets are combined using an algorithm called map join, which makes use of data-level mappings. In evaluation experiments with real-life data sets from the data web, we show the practicability and scalability of the SearchWebDB infrastructure.
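The abstract names map join but does not define it. As a rough illustration of the general idea only, the sketch below joins two result sets whose join keys use different identifiers, treating two identifiers as equal when a data-level mapping links them. Everything here is an assumption made for illustration: the function name `map_join`, its parameters, the representation of mappings as identifier pairs, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def map_join(left_rows, right_rows, left_key, right_key, same_as):
    """Join two result sets whose join keys use different identifiers.

    `same_as` is an iterable of (left_id, right_id) pairs standing in for
    data-level mappings. Hypothetical sketch, not the paper's algorithm.
    """
    # Index the mappings: left identifier -> set of equivalent right identifiers.
    mapped = defaultdict(set)
    for left_id, right_id in same_as:
        mapped[left_id].add(right_id)

    # Hash the right-hand rows by their join key, as in an ordinary hash join.
    right_index = defaultdict(list)
    for row in right_rows:
        right_index[row[right_key]].append(row)

    joined = []
    for lrow in left_rows:
        # An identifier matches itself plus every identifier it is mapped to.
        candidates = {lrow[left_key]} | mapped.get(lrow[left_key], set())
        for rid in sorted(candidates):
            for rrow in right_index.get(rid, []):
                joined.append({**lrow, **rrow})
    return joined


# Toy result sets from two made-up sources, srcA and srcB.
left = [
    {"person": "srcA:p1", "name": "Alice"},
    {"person": "srcA:p2", "name": "Bob"},
]
right = [
    {"author": "srcB:x9", "title": "Paper One"},
    {"author": "srcB:x7", "title": "Paper Two"},
]
# One data-level mapping: srcA:p1 and srcB:x9 denote the same entity.
same_as = [("srcA:p1", "srcB:x9")]

result = map_join(left, right, "person", "author", same_as)
print(result)
```

In this sketch only the row for `srcA:p1` finds a partner, because it is the only identifier with a mapping into the second source. A plain equality join on the raw identifiers would return nothing.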