Massively parallel learning of Bayesian networks with MapReduce for factor relationship analysis

Chen Wei, Wang Tengjiao, Yang Dongqing, Lei Kai, Liu Yueqin
DOI: https://doi.org/10.1109/IJCNN.2013.6706814
2013-01-01
Abstract: The Bayesian network (BN) is one of the most popular models in data mining. Most BN structure learning algorithms are developed for centralized datasets, where all the data are gathered on a single computer node. They are often too costly or impractical for learning BN structures from large-scale data. Through a simple interface with two functions, map and reduce, MapReduce facilitates parallel implementation of many real-world tasks, such as data processing for search engines and machine learning. In this paper, we present a parallel algorithm for BN structure learning from large-scale datasets using a MapReduce cluster. We discuss the benefits of using MapReduce for BN structure learning, and demonstrate the performance of this approach by applying it to a real-world task from the domain of financial analysis: learning the relationships among financial factors. © 2013 IEEE.
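The abstract does not describe the algorithm itself, so the following is only a minimal sketch of how the map and reduce functions might be used in this setting, not the authors' method. It assumes a score-based learner that needs counts of each variable's value jointly with its candidate parents' values, and it simulates the shuffle step in a single process. The names `map_record`, `reduce_counts`, `run_mapreduce`, and the toy records are all hypothetical.

```python
from collections import defaultdict


def map_record(record, families):
    """Map step: for each candidate (child, parents) family, emit a count of 1
    keyed by the child's value and its parents' values in this record."""
    for child, parents in families:
        parent_values = tuple(record[p] for p in parents)
        yield (child, parents, parent_values, record[child]), 1


def reduce_counts(key, values):
    """Reduce step: sum all partial counts that share the same key."""
    return key, sum(values)


def run_mapreduce(records, families):
    """Single-process stand-in for a cluster: map, group by key, reduce."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_record(record, families):
            grouped[key].append(value)  # simulated shuffle
    return dict(reduce_counts(k, v) for k, v in grouped.items())


# Toy, made-up records (not data from the paper).
records = [
    {"rate": "up", "price": "down"},
    {"rate": "up", "price": "down"},
    {"rate": "down", "price": "up"},
    {"rate": "up", "price": "up"},
]

# Candidate families to score: price with parent rate, and rate with no parents.
families = [("price", ("rate",)), ("rate", ())]

counts = run_mapreduce(records, families)
```

Because each record is mapped independently and the reduce step only adds integers, the counting can be split across nodes; a structure score could then be computed from `counts` without another pass over the raw data.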