Discovering Matching Dependencies

Shaoxu Song, Lei Chen
DOI: https://doi.org/10.1145/1645953.1646135
2009-01-01
Abstract: Matching dependencies (MDs) have recently been proposed for data quality applications such as detecting violations of integrity constraints and identifying duplicate objects. In this paper, we study the problem of discovering matching dependencies in a given database instance. First, we formally define two measures, support and confidence, for evaluating the utility of MDs in the given database instance. Then, we study the discovery of MDs that meet given utility requirements on support and confidence. We develop exact algorithms, together with pruning strategies that improve their running time. Finally, our experimental evaluation demonstrates the efficiency of the proposed methods.
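
The abstract names support and confidence but does not give their definitions. As a rough illustration only, the sketch below assumes association-rule-style definitions over tuple pairs: support is the fraction of all pairs that satisfy both sides of the MD, and confidence is the fraction of pairs satisfying the left-hand side that also satisfy the right-hand side. The function names, the per-attribute threshold dictionaries, and the use of `difflib.SequenceMatcher` as the similarity metric are all assumptions made for this example, not the paper's formal definitions or algorithms.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for any chosen metric (assumption)."""
    return SequenceMatcher(None, a, b).ratio()


def md_support_confidence(rows, lhs, rhs):
    """Score a candidate MD over all unordered tuple pairs.

    rows: list of dicts mapping attribute name -> string value.
    lhs, rhs: dicts mapping attribute name -> minimum similarity threshold.
    Returns (support, confidence) under the assumed definitions.
    """
    total = lhs_hits = both_hits = 0
    for r1, r2 in combinations(rows, 2):
        total += 1
        # Pair satisfies the left-hand side if every LHS attribute is similar enough.
        if all(similarity(r1[a], r2[a]) >= t for a, t in lhs.items()):
            lhs_hits += 1
            # Only pairs that pass the LHS are checked against the RHS.
            if all(similarity(r1[a], r2[a]) >= t for a, t in rhs.items()):
                both_hits += 1
    support = both_hits / total if total else 0.0
    confidence = both_hits / lhs_hits if lhs_hits else 0.0
    return support, confidence
```

Calling `md_support_confidence(rows, {"name": 0.8}, {"city": 1.0})` would score the hypothetical dependency "similar names imply identical cities". This brute-force version compares every pair of tuples, which is the kind of cost that the pruning strategies mentioned in the abstract are meant to reduce.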