The Research and Design of SQL Processing in a Data-Mining System Based on MapReduce

Lei Zhang, Kaiping Li, Bin Wu
DOI: https://doi.org/10.1109/ccis.2011.6045079
2011-01-01
Abstract: SQL is a widely used database language. Its main function is data processing, which makes it applicable to data mining. With the rapid growth of data, large-scale data processing has become a focal point of information technology. SQL can still be used at this scale, but where to store the data and how to retrieve it efficiently and cost-effectively are difficult problems. Cloud computing has emerged to address them, as it is designed mainly for large-scale data processing. In this paper, we design a data-mining system that processes SQL directly on Hadoop, a parallel storage and computing platform. We then discuss its running-time efficiency.