Efficient Top-K Algorithm for Extensible Markup Language Keyword Search

H. Yu, Z.-H. Deng, N. Gao
DOI: https://doi.org/10.1049/iet-sen.2011.0082
2012-01-01
IET Software
Abstract: The ability to compute top-k matches to eXtensible Markup Language (XML) queries is gaining importance owing to the increasing number of large XML repositories. Current work on top-k matching for XML queries mainly focuses on employing XPath, XQuery or NEXI as the query language, whereas little work has addressed top-k matching for XML keyword search. In this study, the authors propose a novel two-layer-based index construction and an associated algorithm for efficiently computing top-k results for XML keyword search. The core contribution, the two-layer-based inverted index and its associated algorithm for XML keyword search, takes both the score-sorted sequence and the Dewey ID-sorted sequence into consideration, and thus gains performance benefits during query processing. The authors have conducted extensive experiments, and the experimental results show efficiency advantages compared with existing approaches.
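The abstract does not give the layout of the two-layer index, so the following is only an illustrative sketch of the general idea it names: keeping each keyword's postings in both Dewey ID order and score order. All names here (`PostingList`, `build_index`, `top_k`, `lca`) and the sample scores are assumptions made for illustration, not the authors' data structure or algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class PostingList:
    # Hypothetical structure: the same postings kept in two orders.
    by_dewey: list = field(default_factory=list)  # (dewey_id, score), document order
    by_score: list = field(default_factory=list)  # (dewey_id, score), best score first


def build_index(occurrences):
    """Build a keyword -> PostingList map from (keyword, dewey_id, score) triples."""
    raw = {}
    for keyword, dewey, score in occurrences:
        raw.setdefault(keyword, []).append((dewey, score))
    index = {}
    for keyword, postings in raw.items():
        index[keyword] = PostingList(
            # Tuples compare lexicographically, which matches Dewey (document) order.
            by_dewey=sorted(postings),
            # Descending score, ties broken by Dewey ID.
            by_score=sorted(postings, key=lambda p: (-p[1], p[0])),
        )
    return index


def top_k(index, keyword, k):
    """Return the k highest-scoring postings for a single keyword."""
    postings = index.get(keyword)
    return postings.by_score[:k] if postings else []


def lca(a, b):
    """Lowest common ancestor of two nodes: the longest common Dewey ID prefix."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


# Made-up sample data: Dewey IDs are tuples, scores are arbitrary.
occurrences = [
    ("xml", (1, 2, 1), 0.4),
    ("xml", (1, 1), 0.9),
    ("xml", (1, 3, 2), 0.7),
    ("search", (1, 2, 2), 0.8),
]
index = build_index(occurrences)
```

In this sketch the score-sorted list lets a query stop after the first k postings, while the Dewey-sorted list and the prefix-based `lca` support locating common ancestors of keyword matches. How the paper combines the two orders is not stated in the abstract.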