Diversifying repairs of denial constraint violations

Shuai Li, Yue Zhang, Zijing Tan, Shuai Ma
DOI: https://doi.org/10.1016/j.is.2022.102041
IF: 3.18
2022-09-01
Information Systems
Abstract: Denial constraints (DCs) are expressive enough to subsume many other dependencies and have proven useful in data cleaning for improving data quality. As a complement to methods that compute a single (nearly) optimal repair of DC violations, this paper makes the first effort to diversify repairs of DC violations, aiming to generate a set of diversified repairs. (1) We adapt the concept of cardinality-set-minimal repairs to DCs, and relate a cardinality-set-minimal repair to a minimal vertex cover of the conflict hypergraph w.r.t. a given set Σ of DCs on a relational instance I. (2) We formalize the problem of diversifying cardinality-set-minimal repairs of DC violations, and address the problem by presenting a set of algorithms and optimizations to generate a set of diversified minimal vertex covers of the conflict hypergraph. (3) Using both real-life and synthetic data, we conduct extensive experiments to verify the effectiveness and efficiency of our methods.
computer science, information systems
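The link the abstract draws between repairs and minimal vertex covers of the conflict hypergraph can be illustrated on a toy instance. The following is a minimal sketch and not the paper's algorithm: it assumes that hypergraph vertices are cells (row index, attribute), that each hyperedge collects the cells involved in one violation, and that diversity is measured by Jaccard distance with a greedy max-min selection. The relation, the single DC, and all function names are hypothetical, and the enumeration is brute force, so it is only suitable for very small inputs.

```python
from itertools import combinations

# Hypothetical toy instance: each tuple is (zip, city).
rows = [
    ("10001", "NYC"),
    ("10001", "Boston"),
    ("10001", "NYC"),
    ("02101", "Boston"),
]

def conflict_hyperedges(rows):
    """Hyperedges for the assumed DC: not(t1.zip == t2.zip and t1.city != t2.city).

    Each hyperedge is the set of cells (row index, attribute) in one violation.
    """
    edges = set()
    for i, j in combinations(range(len(rows)), 2):
        if rows[i][0] == rows[j][0] and rows[i][1] != rows[j][1]:
            edges.add(frozenset({(i, "zip"), (i, "city"), (j, "zip"), (j, "city")}))
    return edges

def is_cover(cells, edges):
    """A cell set is a vertex cover if it intersects every hyperedge."""
    return all(cells & e for e in edges)

def minimal_vertex_covers(edges):
    """Brute-force enumeration of minimal vertex covers, smallest first."""
    vertices = sorted(set().union(*edges)) if edges else []
    covers = []
    for k in range(len(vertices) + 1):
        for combo in combinations(vertices, k):
            s = frozenset(combo)
            # Sizes are visited in increasing order, so s is minimal exactly
            # when no previously found (smaller) cover is contained in it.
            if is_cover(s, edges) and not any(c <= s for c in covers):
                covers.append(s)
    return covers

def jaccard_distance(a, b):
    """Assumed diversity measure between two covers."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0

def diversify(covers, k):
    """Greedy max-min selection of up to k mutually distant covers."""
    if not covers:
        return []
    chosen = [covers[0]]
    while len(chosen) < min(k, len(covers)):
        rest = [c for c in covers if c not in chosen]
        chosen.append(
            max(rest, key=lambda c: min(jaccard_distance(c, d) for d in chosen))
        )
    return chosen

edges = conflict_hyperedges(rows)
covers = minimal_vertex_covers(edges)
picked = diversify(covers, 2)
for cover in picked:
    print(sorted(cover))
```

Each printed cover is a candidate set of cells to modify. Turning a cover into an actual repair still requires choosing new values for those cells, which this sketch leaves out.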