Uncertain Knowledge Representation And Inference For Lineage Processing Over Uncertain Data

Kun Yue, Weiyi Liu, Hao Wu, Dapeng Tao, Ming Gao
2018-01-01
Abstract: In this chapter, we focus on the representation and processing of lineages over uncertain data, adopting the Bayesian network (BN) as the framework for uncertainty representation and inference. Starting from lineages expressed as Boolean formulae for Selection-Projection-Join (SPJ) queries over uncertain data, we give a method to transform a lineage expression into an equivalent directed acyclic graph (DAG). Specifically, we discuss the corresponding probabilistic semantics and properties that theoretically guarantee the correctness of probabilistic inferences. We then propose a function-based method to compute the conditional probability table (CPT) for each node in the DAG. The resulting BN for representing lineage expressions over uncertain data, called the lineage BN (LBN), is suitable for both safe and unsafe query plans. Next, we give a variable-elimination-based algorithm for exact inference over the LBN to obtain the probabilities of query results, called LBN-based query processing. We also consider obtaining the probabilities of input or intermediate tuples conditioned on query results, called LBN-based inference query processing, and give a Gibbs-sampling-based algorithm for approximate inference over the LBN. Experimental results show the efficiency and effectiveness of our methods.
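To make the two kinds of queries concrete, the following is a minimal sketch, not the chapter's LBN construction. It assumes three independent uncertain input tuples with made-up probabilities and a hypothetical lineage formula for one result tuple, and it uses brute-force enumeration of possible worlds in place of variable elimination or Gibbs sampling. All names (`tuple_probs`, `lineage`, `result_probability`, `posterior`) are illustrative assumptions.

```python
from itertools import product

# Hypothetical marginal probabilities of three independent uncertain input tuples.
tuple_probs = {"x1": 0.6, "x2": 0.5, "y1": 0.8}


def lineage(world):
    """Hypothetical lineage of one result tuple: (x1 AND y1) OR (x2 AND y1)."""
    return (world["x1"] and world["y1"]) or (world["x2"] and world["y1"])


def joint(world):
    """Probability of one possible world under the independence assumption."""
    p = 1.0
    for name, present in world.items():
        p *= tuple_probs[name] if present else 1.0 - tuple_probs[name]
    return p


def worlds():
    """Enumerate every truth assignment to the input tuples."""
    names = list(tuple_probs)
    for values in product([False, True], repeat=len(names)):
        yield dict(zip(names, values))


def result_probability():
    """P(result tuple exists): sum over worlds that satisfy the lineage."""
    return sum(joint(w) for w in worlds() if lineage(w))


def posterior(name):
    """P(input tuple `name` exists | result tuple exists)."""
    numerator = sum(joint(w) for w in worlds() if lineage(w) and w[name])
    return numerator / result_probability()


if __name__ == "__main__":
    print(result_probability())
    print(posterior("x1"))
```

Enumeration is exponential in the number of input tuples, which is the cost that exact and approximate BN inference algorithms aim to reduce; the sketch only illustrates what the two probabilities mean.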