Consistent query answers in inconsistent probabilistic databases.

Xiang Lian, Lei Chen, Shaoxu Song
DOI: https://doi.org/10.1145/1807167.1807202
2010-01-01
Abstract: Efficient and effective manipulation of probabilistic data has become increasingly important because many real applications involve data uncertainty. This is especially crucial when probabilistic data collected from different sources disagree with each other and thus introduce inconsistencies. To accommodate such inconsistencies and enable consistent query answering (CQA), we propose the all-possible-repair semantics for inconsistent probabilistic databases, which formalizes the repairs of the database as repair worlds via a graph representation. The CQA problem can then be converted into one over the so-called repaired possible worlds (defined w.r.t. both repair worlds and possible worlds). We investigate a series of consistent queries in inconsistent probabilistic databases, including consistent range, join, and top-k queries; these queries, however, must deal with an exponential number of repaired possible worlds, at high cost. To tackle this efficiency problem, we propose efficient approaches for retrieving consistent query answers, including effective pruning methods that filter out false positives. Extensive experiments demonstrate the efficiency and effectiveness of our approaches.
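To make the cost argument concrete, the following is a minimal brute-force sketch of a consistent range query, not the paper's algorithm. It rests on assumptions that the abstract does not state: tuples exist independently with a given probability, inconsistencies are given as pairwise conflict edges in a graph, a repair of a possible world is a maximal conflict-free subset of it, and a tuple counts as a consistent answer in a world only if it lies in the query range and appears in every repair of that world. All names (`possible_worlds`, `repairs`, `consistent_answer_probs`) are hypothetical.

```python
from itertools import combinations


def possible_worlds(probs):
    """Yield (world, probability) for every subset of tuples.

    Assumes tuple-independent existence probabilities (an assumption
    made for this sketch only).
    """
    ids = sorted(probs)
    for r in range(len(ids) + 1):
        for combo in combinations(ids, r):
            chosen = set(combo)
            p = 1.0
            for t in ids:
                p *= probs[t] if t in chosen else 1.0 - probs[t]
            yield frozenset(chosen), p


def repairs(world, conflicts):
    """Return all maximal conflict-free subsets of `world`.

    `conflicts` is a list of tuple-id pairs (edges of a conflict graph).
    """
    def is_consistent(subset):
        return not any(a in subset and b in subset for a, b in conflicts)

    members = sorted(world)
    candidates = [
        frozenset(c)
        for r in range(len(members) + 1)
        for c in combinations(members, r)
        if is_consistent(set(c))
    ]
    # Keep only subsets that are not strictly contained in another candidate.
    return [c for c in candidates if not any(c < d for d in candidates)]


def consistent_answer_probs(values, probs, conflicts, lo, hi):
    """Sum, per tuple, the probability of worlds where it is a consistent answer."""
    result = {t: 0.0 for t in probs}
    for world, p in possible_worlds(probs):
        world_repairs = repairs(world, conflicts)
        for t in world:
            in_range = lo <= values[t] <= hi
            if in_range and all(t in rep for rep in world_repairs):
                result[t] += p
    return result


# Toy data: t1 and t2 conflict (for example, two sources disagree on one entity).
values = {"t1": 10, "t2": 12, "t3": 30}
probs = {"t1": 0.5, "t2": 0.5, "t3": 0.8}
conflicts = [("t1", "t2")]
answers = consistent_answer_probs(values, probs, conflicts, lo=5, hi=20)
```

In this toy instance, `t1` is a consistent answer only in worlds where `t2` is absent, so its probability is about 0.25; `t3` falls outside the range and gets 0. Both loops enumerate subsets, so the running time grows exponentially with the number of tuples, which is the cost that pruning methods are meant to avoid.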