Moving big data to the cloud

Linquan Zhang,Chuan Wu,Zongpeng Li,Chuanxiong Guo,Minghua Chen,Francis C. M. Lau
DOI: https://doi.org/10.1109/INFCOM.2013.6566804
2013-01-01
Abstract:Cloud computing, rapidly emerging as a new computation paradigm, provides agile and scalable resource access in a utility-like fashion, especially for the processing of big data. An important open issue here is how to efficiently move the data, from different geographical locations over time, into a cloud for effective processing. The de facto approach of hard drive shipping is not flexible, nor secure. This work studies timely, cost-minimizing upload of massive, dynamically-generated, geodispersed data into the cloud, for processing using a MapReducelike framework. Targeting at a cloud encompassing disparate data centers, we model a cost-minimizing data migration problem, and propose two online algorithms, for optimizing at any given time the choice of the data center for data aggregation and processing, as well as the routes for transmitting data there. The first is an online lazy migration (OLM) algorithm achieving a competitive ratio of as low as 2.55, under typical system settings. The second is a randomized fixed horizon control (RFHC) algorithm achieving a competitive ratio of 1+ 1/l+λ κ/λ with a lookahead window of l, where κ and λ are system parameters of similar magnitude.
What problem does this paper attempt to address?