Data-Intensive Inferences Of Large-Scale Bayesian Networks

Kun Yue, Weiyi Liu, Hao Wu, Dapeng Tao, Ming Gao
2018-01-01
Abstract: In this chapter, we develop a method for data-intensive probabilistic inference over uncertain knowledge represented by large-scale Bayesian networks (BNs), which may be learned from massive data by the method presented in Chapter 2 or given directly for complex applications. Real applications built on BN inference demand efficiency, which is hard to achieve because inference has exponential complexity with respect to the number of nodes in the directed acyclic graph (DAG) and the number of parameters in the conditional probability tables (CPTs). We therefore give a parallel inference method that computes joint probability distributions over large-scale BNs using MapReduce, extending the classic algorithm for exact BN inference. In this method, we treat the large-scale BN as a massive dataset stored in a distributed system and transform the relevant inference steps into data-intensive operations on the distributed file storage. We then present a case study that uses the proposed algorithm for data-intensive probabilistic inference to discover user similarities in social media. Experimental results show the efficiency and effectiveness of the proposed method.
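To make the idea of treating a BN as a dataset concrete, the following is a minimal in-process sketch, not the chapter's algorithm. It assumes the CPTs are flattened into records and that a joint probability is the product of the CPT entries consistent with a full assignment (the standard chain-rule factorization of a BN). The two-node network, the record layout, and the names `CPT_ROWS`, `map_row`, `reduce_probs`, and `joint_probability` are all hypothetical; a real deployment would run the map and reduce steps on a MapReduce framework over distributed storage.

```python
from functools import reduce

# Hypothetical toy BN with one edge A -> B. Each CPT entry is a flat record:
# (node, parent assignment as (parent, value) pairs, node value, probability).
CPT_ROWS = [
    ("A", (), 1, 0.3),
    ("A", (), 0, 0.7),
    ("B", (("A", 1),), 1, 0.9),
    ("B", (("A", 1),), 0, 0.1),
    ("B", (("A", 0),), 1, 0.2),
    ("B", (("A", 0),), 0, 0.8),
]


def map_row(row, assignment):
    """Map step: emit the row's probability if it matches the assignment."""
    node, parents, value, prob = row
    if assignment.get(node) != value:
        return []
    if any(assignment.get(p) != v for p, v in parents):
        return []
    return [("joint", prob)]


def reduce_probs(pairs):
    """Reduce step: multiply all probabilities emitted under the same key."""
    return reduce(lambda acc, p: acc * p, (p for _, p in pairs), 1.0)


def joint_probability(assignment, rows=CPT_ROWS):
    """Simulate map then reduce in a single process for a full assignment."""
    emitted = [kv for row in rows for kv in map_row(row, assignment)]
    return reduce_probs(emitted)


if __name__ == "__main__":
    # P(A=1, B=1) = P(A=1) * P(B=1 | A=1), roughly 0.27
    print(joint_probability({"A": 1, "B": 1}))
```

Because each record is filtered independently in the map step, the rows could in principle be partitioned across workers, which is the property a data-intensive formulation relies on.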