Abstract: When users issue a query to a database, they have expectations about the results. If what they search for is unavailable in the database, the system will return an empty result or, worse, erroneous mismatched results. We call this the MisMatch problem. In this paper, we solve the MisMatch problem in the context of XML keyword search. Our solution is based on two novel concepts that we introduce: Target Node Type and Distinguishability. Target Node Type represents the type of node a query result intends to match, and Distinguishability measures the importance of the query keywords. Using these concepts, we develop a low-cost post-processing algorithm that runs on the results of query evaluation to detect the MisMatch problem and generate helpful suggestions to users. Our approach has three noteworthy features: (1) for queries with the MisMatch problem, it outputs an explanation, suggested queries, and sample results of those queries, helping users judge whether the MisMatch problem is solved without reading all query results; (2) it is portable, as it works with any lowest common ancestor-based matching semantics (for XML data without ID references) or minimal Steiner tree-based matching semantics (for XML data with ID references) that return tree structures as results, and it is orthogonal to the choice of result retrieval method; (3) it is lightweight, occupying only a very small proportion of the total query evaluation time. Extensive experiments on three real datasets verify the effectiveness, efficiency, and scalability of our approach. A search engine called XClear has been built and is available at http://xclear.comp.nus.edu.sg.
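
To make the notion of lowest common ancestor-based matching concrete, the following is a minimal sketch, not the algorithm of this paper. It assumes that XML nodes are identified by Dewey labels (tuples of child positions from the root), a common but here purely illustrative encoding; the function name `lca` and the sample labels are hypothetical.

```python
from typing import List, Tuple

# Assumption: each XML node is identified by a Dewey label, e.g. (1, 1, 2)
# is the 2nd child of the 1st child of the root. This encoding is an
# illustrative choice and is not taken from the paper.
Dewey = Tuple[int, ...]


def lca(labels: List[Dewey]) -> Dewey:
    """Return the Dewey label of the lowest common ancestor of the nodes.

    With Dewey labels, the LCA is the longest common prefix of all labels.
    """
    if not labels:
        raise ValueError("need at least one label")
    prefix = labels[0]
    for label in labels[1:]:
        n = 0
        # Count how many leading components the two labels share.
        for a, b in zip(prefix, label):
            if a != b:
                break
            n += 1
        prefix = prefix[:n]
    return prefix


# Hypothetical keyword matches: each keyword matched one node.
matches = {"xml": (1, 1, 1), "smith": (1, 1, 2)}
result_root = lca(list(matches.values()))  # root of the returned subtree
```

Under an LCA-based semantics, the subtree rooted at `result_root` would be a candidate result for the query; a post-processing step such as the one described above would then inspect such result trees rather than change how they are retrieved.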

A General Framework to Resolve the MisMatch Problem in XML Keyword Search
