RESEARCH ON ROUGH SET AND BAYES-BASED METEOROLOGICAL DATA MINING ON HADOOP PLATFORM

Chenyang Zhang, Zhiqiang Ma, Limin Liu, Jun Chang, Yongli Li
DOI: https://doi.org/10.3969/j.issn.1000-386x.2015.04.017
2015-01-01
Abstract: With the continued development of meteorological informatisation, meteorological departments have accumulated massive volumes of data, and how to extract useful knowledge from this data has become a focus of attention. Meteorological data is high-dimensional and strongly dependent, which places higher demands on meteorological data mining. Classic data mining algorithms fall short in both performance and accuracy when processing massive meteorological data. Based on an analysis of the MapReduce computing model, rough set theory and Bayesian classification, we propose a MapReduce-based data reduction algorithm that computes equivalence classes and a MapReduce-based naive Bayesian classification algorithm. Finally, we carry out the corresponding experiments on the Hadoop platform. The experimental results show that this parallel data mining scheme can process massive meteorological data efficiently and has good scalability.
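The abstract does not give the algorithms themselves, so the following is only a minimal sketch of the general idea behind computing rough-set equivalence classes in a map/reduce style. It is not the authors' implementation. It imitates the two phases in a single Python process and uses no Hadoop API. The function names (`map_phase`, `reduce_phase`, `equivalence_classes`) and the toy weather records are hypothetical.

```python
from collections import defaultdict

def map_phase(records, attrs):
    """Emit (key, record_id) pairs; key is the record's values on attrs."""
    for rec_id, rec in records.items():
        yield tuple(rec[a] for a in attrs), rec_id

def reduce_phase(pairs):
    """Group record ids by key; each group is one equivalence class."""
    classes = defaultdict(set)
    for key, rec_id in pairs:
        classes[key].add(rec_id)
    return dict(classes)

def equivalence_classes(records, attrs):
    """Partition records into classes indiscernible on the given attributes."""
    return reduce_phase(map_phase(records, attrs))

# Hypothetical toy data, not taken from the paper.
records = {
    1: {"humidity": "high", "wind": "weak", "rain": "yes"},
    2: {"humidity": "high", "wind": "weak", "rain": "yes"},
    3: {"humidity": "low", "wind": "weak", "rain": "no"},
    4: {"humidity": "low", "wind": "strong", "rain": "no"},
}

classes = equivalence_classes(records, ["humidity", "wind"])
for key, members in sorted(classes.items()):
    print(key, sorted(members))
```

In this sketch, records that agree on every chosen attribute land under the same key, which mirrors how a shuffle step would route them to the same reducer. Attribute reduction would then compare the partitions induced by different attribute subsets, a step that is left out here.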