Abstract: When users issue a query to a database, they have expectations about the results. If what they search for is unavailable in the database, the system returns an empty result or, worse, erroneous mismatch results. We call this the MisMatch problem. In this paper, we solve the MisMatch problem in the context of XML keyword search. Our solution is based on two novel concepts that we introduce: Target Node Type and Distinguishability. Target Node Type represents the type of node a query result intends to match, and Distinguishability measures the importance of the query keywords. Using these concepts, we develop a low-cost post-processing algorithm that runs on the results of query evaluation to detect the MisMatch problem and generate helpful suggestions for users. Our approach has three noteworthy features: (1) for queries with the MisMatch problem, it outputs an explanation, suggested queries, and their sample results, helping users judge whether the MisMatch problem is solved without reading all query results; (2) it is portable, as it works with any lowest common ancestor-based matching semantics (for XML data without ID references) or minimal Steiner tree-based matching semantics (for XML data with ID references) that return tree structures as results, and it is orthogonal to the choice of result retrieval method; (3) it is lightweight, in that it occupies a very small proportion of the whole query evaluation time. Extensive experiments on three real datasets verify the effectiveness, efficiency, and scalability of our approach. A search engine called XClear has been built and is available at http://xclear.comp.nus.edu.sg.

The Interaction Between Schema Matching and Record Matching in Data Integration

Smartint: A Demonstration System For The Interaction Between Schema Mapping And Record Matching

Reserch of Entity Matching Based on Multiple Heterogenous Data

Research of Matching Technology in Data Integration

An Efficient Schema Matching Approach Using Previous Mapping Result Set

ReMatch: Retrieval Enhanced Schema Matching with LLMs

A Generic Algorithm for Heterogeneous Schema Matching

Towards a Composite XML Schema Matching Approach Using Reference Ontology

Record Matching with Non-Key Attribute Values

Matchmaker: Self-Improving Large Language Model Programs for Schema Matching

RI-Match: Integrating Both Representations and Interactions for Deep Semantic Matching

HISMA: A Human-machine Iterative Schema Matching Algorithm

A General Framework to Resolve the MisMatch Problem in XML Keyword Search

Mining schema matching between heterogeneous databases

GRAM: Generative Retrieval Augmented Matching of Data Schemas in the Context of Data Security

Schema Matching using Machine Learning

In Situ Neural Relational Schema Matcher

Nokearm: Employing Non-Key Attributes In Record Matching

An Approach of Xml Schema Matching Using Top-K Mapping

Schema Matching with Large Language Models: an Experimental Study

Human-in-the-loop Data Integration