Recording How-Provenance on Probabilistic Databases.

Ming Gao, Xiangnan He, Cheqing Jin, Xiaoling Wang, Aoying Zhou
DOI: https://doi.org/10.1109/apweb.2010.19
2010-01-01
Abstract: Tracking data provenance (or lineage) has become increasingly important in many large-scale applications, and several methods for recording data provenance have been proposed recently. However, most previous work focuses on deterministic databases; the exception is Trio-style lineage, which targets probabilistic databases. The probabilistic setting is much more challenging because of the exponential growth of possible-world instances and the dependence among intermediate tuples. This paper proposes an approach, named PHP-tree, to model how-provenance on probabilistic databases. We also show how to evaluate probability based on a PHP-tree. Compared with Trio-style lineage, our approach is independent of intermediate results and can calculate the probability in both cases: restricted and complete propagation of data provenance. Detailed experimental results show the effectiveness, efficiency, and scalability of the proposed model.