When Queueing Meets Coding: Optimal-latency Data Retrieving Scheme in Storage Clouds
Shengbo Chen, Yin Sun, Ulas C. Kozat, Longbo Huang, Prasun Sinha, Guanfeng Liang, Xin Liu, Ness B. Shroff
DOI: https://doi.org/10.1109/infocom.2014.6848034
2014-01-01
Abstract: Storage clouds, such as Amazon S3, are widely used for web services and Internet applications. It has been observed that the delay for retrieving data from and placing data into the clouds is quite random and exhibits weak correlations between different read/write requests. This inspires us to investigate a key problem: can we reduce the delay by transmitting data replicas in parallel or by using powerful erasure codes? In this paper, we study the problem of reducing the delay of downloading data from cloud storage systems by leveraging multiple parallel threads, assuming that the data has been encoded and stored in the clouds using fixed-rate forward error correction (FEC) codes with parameters (n, k). That is, each file is divided into k equal-sized chunks, which are then expanded into n chunks such that any k of the n chunks are sufficient to restore the original file. The model can be depicted as a multiple-server queue in which the arrivals are data-retrieval requests and each server corresponds to a thread. However, this is not a typical queueing model, because a server can terminate its operation depending on when other servers complete their service (due to the redundancy that is spread across the threads). Hence, to the best of our knowledge, the analysis of this queueing model remains largely uncharted. Real traces from Amazon S3 show that the time to retrieve a fixed-size chunk is random and can be accurately approximated as an i.i.d. exponentially distributed random variable. We show that any work-conserving scheme is delay-optimal when k = 1. When k > 1, we find that a simple greedy scheme, which allocates all available threads to the head-of-line request, is delay-optimal, a result that may appear surprising.
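To make the (n, k) retrieval model concrete, the following minimal sketch considers a single request in isolation, with no queueing: n threads each fetch a distinct coded chunk, chunk times are i.i.d. exponential with rate mu as the abstract assumes, and the file is restored once any k chunks arrive. Under that assumption the mean retrieval time is the mean of the k-th order statistic, sum over i = 0..k-1 of 1/((n - i) mu), a standard property of exponential variables rather than a result taken from the paper. The function names, the default rate, and the trial count are illustrative choices, and the sketch does not model the greedy scheduling scheme itself.

```python
import random


def expected_kth_of_n(n, k, mu=1.0):
    """Mean time until k of n parallel i.i.d. Exp(mu) chunk downloads finish.

    Uses the order-statistic identity for exponentials: while j downloads are
    still running, the next completion takes Exp(j * mu) time on average.
    """
    if not 1 <= k <= n:
        raise ValueError("require 1 <= k <= n")
    return sum(1.0 / ((n - i) * mu) for i in range(k))


def simulate_kth_of_n(n, k, mu=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity (hypothetical helper)."""
    if not 1 <= k <= n:
        raise ValueError("require 1 <= k <= n")
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Draw one download time per thread; the request completes
        # when the k-th fastest chunk has arrived.
        times = sorted(rng.expovariate(mu) for _ in range(n))
        total += times[k - 1]
    return total / trials


if __name__ == "__main__":
    for n, k in [(1, 1), (2, 1), (4, 2), (6, 3)]:
        exact = expected_kth_of_n(n, k)
        approx = simulate_kth_of_n(n, k)
        print(f"(n={n}, k={k}): exact={exact:.4f}, simulated={approx:.4f}")
```

Comparing (n, k) = (2, 1) with (4, 2) in this sketch shows how coding at the same rate k/n changes the single-request delay. The queueing question studied in the paper, namely how to share threads among many waiting requests, is not captured here.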