Efficient Mining of High-Speed Uncertain Data Streams

Donghong Han,Christophe Giraud-Carrier,Shuoru Li
DOI: https://doi.org/10.1007/s10489-015-0675-9
IF: 5.3
2015-01-01
Applied Intelligence
Abstract:Currently available algorithms for data streams classification are mostly designed to deal with precise and complete data. However, data in many real-life applications is naturally uncertain due to inherent instrument inaccuracy, wireless transmission error, and so on. We propose UELM-MapReduce, a parallel ensemble classifier based on Extreme Learning Machine (ELM) and MapReduce for handling uncertain data streams. We train an efficient parallel ELM-based ensemble classifier from sequential training chunks of the uncertain data streams. The weight of each base classifier in the ensemble is adjusted according to its mean square error on the up-to-date test chunk, and the classifier with the lowest accuracy is replaced. UELM-MapReduce can classify uncertain data streams with both efficiency and accuracy while effectively handling concept drift. Experimental results demonstrate that UELM-MapReduce has better performance than other methods in prediction accuracy and computational efficiency.
What problem does this paper attempt to address?