DataCloud: an Efficient Massive Data Mining and Analysis Framework on Large Clusters.

Guigang Zhang, Chao Li, Yong Zhang, Chunxiao Xing
DOI: https://doi.org/10.1109/wisa.2012.26
2012-01-01
Abstract: With the development of cloud computing technologies, big data processing is becoming increasingly important, and mining and analyzing massive data remains a major challenge. In this paper, we propose Data Cloud, an efficient framework for massive data mining and analysis on large clusters. The most important part of Data Cloud is Rabbit, a framework for planning massive data mining and analysis processing on large clusters, similar to Pig and Hive. We present a detailed analysis of the Rabbit plan.
What problem does this paper attempt to address?