Cohesiveness Relationships to Empower Keyword Search on Tree Data on the Web

Aggeliki Dimitriou, Ananya Dass, Dimitri Theodoratos
DOI: https://doi.org/10.48550/arXiv.1508.04957
2015-08-20
Databases
Abstract: Keyword search is the most popular querying technique on semistructured data. Keyword queries are simple and convenient. However, as a consequence of their imprecision, the quality of their answers is poor and the existing algorithms do not scale satisfactorily. In this paper, we introduce the novel concept of cohesive keyword queries for tree data. Intuitively, a cohesiveness relationship on keywords indicates that they should form a cohesive whole in a query result. Cohesive keyword queries allow term nesting and keyword repetition. Although more expressive, they are as simple as flat keyword queries. We provide formal semantics for cohesive keyword queries, ranking query results on the proximity of the keyword instances. We design a stack-based algorithm which efficiently evaluates cohesive keyword queries. Our experiments demonstrate that our approach outperforms previous filtering semantics in quality, and that our algorithm scales smoothly on queries of even 20 keywords on large datasets.