Parallel Ensemble of Online Sequential Extreme Learning Machine Based on MapReduce

Shan Huang, Botao Wang, Junhao Qiu, Jitao Yao, Guoren Wang, Ge Yu
DOI: https://doi.org/10.1016/j.neucom.2014.03.076
IF: 6
2016-01-01
Neurocomputing
Abstract: In the era of big data, analyzing large-scale data efficiently and accurately has become a challenging problem. As one of the ELM variants, the online sequential extreme learning machine (OS-ELM) provides a method for analyzing incremental data. Ensemble methods provide a way to learn from data more accurately. MapReduce, which provides a simple, scalable, and fault-tolerant framework, can be used for large-scale learning. In this paper, we first propose an ensemble OS-ELM framework that supports any combination of bagging, subspace partitioning, and cross-validation. We then design a parallel ensemble of online sequential extreme learning machine (PEOS-ELM) algorithm based on MapReduce for large-scale learning. The PEOS-ELM algorithm is evaluated on real and synthetic data with up to 5120K training samples and up to 512 attributes. Its speedup reaches as high as 40 on a cluster with up to 80 cores. The accuracy of PEOS-ELM is at the same level as that of ensemble OS-ELM running on a single machine, which is higher than that of the original OS-ELM.
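To make the OS-ELM and bagging ideas in the abstract concrete, here is a minimal single-process sketch in Python with NumPy. It is not the paper's PEOS-ELM implementation: the class name `OSELM`, the helpers `train_member` and `ensemble_predict`, the small `ridge` term, the sigmoid activation, and the toy data are all assumptions made for illustration. Each ensemble member is trained independently on bootstrap resamples of the incoming chunks (the kind of work a MapReduce job could distribute), and the member predictions are averaged afterwards.

```python
import numpy as np


class OSELM:
    """Minimal OS-ELM regressor with a sigmoid hidden layer (illustrative)."""

    def __init__(self, n_inputs, n_hidden, rng, ridge=1e-3):
        # Random hidden-layer parameters, fixed after creation as in ELM.
        self.W = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))
        self.b = rng.uniform(-1.0, 1.0, size=n_hidden)
        self.ridge = ridge  # assumption: small ridge term for stability
        self.P = None
        self.beta = None

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def partial_fit(self, X, T):
        H = self._hidden(X)
        if self.beta is None:
            # Initialization phase: (regularized) least squares on first chunk.
            self.P = np.linalg.inv(H.T @ H + self.ridge * np.eye(H.shape[1]))
            self.beta = self.P @ H.T @ T
        else:
            # Sequential phase: recursive least-squares update per chunk.
            K = np.linalg.inv(np.eye(H.shape[0]) + H @ self.P @ H.T)
            self.P = self.P - self.P @ H.T @ K @ H @ self.P
            self.beta = self.beta + self.P @ H.T @ (T - H @ self.beta)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


def train_member(seed, chunks, n_hidden):
    """Train one ensemble member on bootstrap resamples of each chunk."""
    rng = np.random.default_rng(seed)
    model = OSELM(chunks[0][0].shape[1], n_hidden, rng)
    for X, T in chunks:
        idx = rng.integers(0, len(X), size=len(X))  # bagging: sample with replacement
        model.partial_fit(X[idx], T[idx])
    return model


def ensemble_predict(models, X):
    """Combine members by averaging their predictions."""
    return np.mean([m.predict(X) for m in models], axis=0)


# Toy regression data (hypothetical), fed to the learners in chunks of 100 rows.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(600, 2))
T = np.sin(X[:, :1]) + 0.5 * X[:, 1:]
chunks = [(X[i:i + 100], T[i:i + 100]) for i in range(0, 500, 100)]

models = [train_member(seed, chunks, n_hidden=20) for seed in range(5)]
pred = ensemble_predict(models, X[500:])
mse = float(np.mean((pred - T[500:]) ** 2))
print(f"held-out MSE of the 5-member ensemble: {mse:.6f}")
```

Because the members never share state during training, the list comprehension that builds `models` is the natural place to parallelize; averaging in `ensemble_predict` is only one possible way to combine the members.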