A MapReduce-Based Method for Learning Bayesian Network from Massive Data.

Qiyu Fang, Kun Yue, Xiaodong Fu, Hong Wu, Weiyi Liu
DOI: https://doi.org/10.1007/978-3-642-37401-2_68
2013-01-01
Abstract: A Bayesian network (BN) is a popular and important probabilistic graphical model for representing and inferring uncertain knowledge. Learning a BN from massive data is the basis for uncertain-knowledge-centered inference, prediction, and decision making. The nature of massive data requires that BN learning be adapted to large data volumes and executed in parallel. In this paper, we propose a MapReduce-based approach for learning a BN from massive data by extending the traditional scoring-and-search algorithm. First, in the scoring process, we develop map and reduce algorithms for obtaining the required parameters in parallel. Second, in the search process, for each node we develop map and reduce algorithms for scoring all candidate local structures in parallel and selecting the locally optimal structure with the highest score. The locally optimal structures of all nodes are then merged into the globally optimal structure. Experimental results indicate that the proposed method is effective and efficient. © 2013 Springer-Verlag.
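The two phases described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the function names (`map_counts`, `reduce_counts`, `local_score`, `best_parents`, `learn_structure`), the BIC-style score, the `max_parents` limit, and the fixed node ordering used to keep the merged structure acyclic are all assumptions made for illustration, and the map and reduce steps run sequentially here rather than on a cluster.

```python
from collections import Counter
from itertools import combinations
from math import log


def map_counts(record, node, parents):
    """Map step: emit ((parent_config, node_value), 1) for one record."""
    key = (tuple(record[p] for p in parents), record[node])
    yield key, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def local_score(data, node, parents):
    """BIC-style score of one candidate local structure (assumed score)."""
    pairs = (kv for record in data for kv in map_counts(record, node, parents))
    counts = reduce_counts(pairs)
    # Total number of records observed for each parent configuration.
    parent_totals = Counter()
    for (config, _), n in counts.items():
        parent_totals[config] += n
    log_lik = sum(n * log(n / parent_totals[config])
                  for (config, _), n in counts.items())
    n_values = len({record[node] for record in data})
    n_params = len(parent_totals) * (n_values - 1)
    return log_lik - 0.5 * n_params * log(len(data))


def best_parents(data, node, candidates, max_parents=2):
    """Score every candidate parent set and keep the highest-scoring one."""
    options = [ps for k in range(max_parents + 1)
               for ps in combinations(candidates, k)]
    return max(options, key=lambda ps: local_score(data, node, ps))


def learn_structure(data, order, max_parents=2):
    """Merge per-node optima; parents are limited to earlier nodes in `order`
    (an assumption that guarantees the merged graph is acyclic)."""
    return {node: best_parents(data, node, order[:i], max_parents)
            for i, node in enumerate(order)}
```

Calling `learn_structure(data, ["A", "B", "C"])` on a list of dictionaries keyed by variable name returns a mapping from each node to its selected parent tuple. In a real MapReduce deployment, the pairs emitted by `map_counts` would be shuffled by key and summed by distributed reducers, and the candidate structures for each node could be scored by separate tasks.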