A Quantitative and Comparative Study of Network-Level Efficiency for Cloud Storage Services
Zhenhua Li, Yongfeng Zhang, Yunhao Liu, Tianyin Xu, Ennan Zhai, Yao Liu, Xiaobo Ma, Zhenyu Li
DOI: https://doi.org/10.1145/3274526
2019-01-01
ACM Transactions on Modeling and Performance Evaluation of Computing Systems
Abstract: Cloud storage services such as Dropbox and OneDrive provide users with a convenient and reliable way to store and share data from anywhere, on any device, and at any time. Their cornerstone is the data synchronization (sync) operation, which automatically maps the changes in users' local file systems to the cloud via a series of network communications in a timely manner. Without careful design and implementation, however, the data sync mechanisms could generate overwhelming traffic, causing tremendous financial overhead and performance penalties to both service providers and end users. This article addresses a simple yet critical question: Is the current data sync traffic of cloud storage services efficiently used? We first define a novel metric TUE to quantify the Traffic Usage Efficiency of data synchronization. Then, by conducting comprehensive benchmark experiments and reverse engineering the data sync processes of eight widely used cloud storage services, we uncover their manifold practical endeavors for optimizing the TUE, including three intra-file approaches (compression, incremental sync, and interrupted transfer resumption), two cross-file/-user approaches (i.e., deduplication and peer-assisted offloading), two batching approaches (file bundling and sync deferment), and two web-specific approaches (thumbnail views and dynamic content loading). Our measurement results reveal that a considerable portion of the data sync traffic is, in a sense, wasteful and can be effectively avoided or significantly reduced via carefully designed data sync mechanisms. Most importantly, our study not only offers practical, actionable guidance for providers to build more efficient, traffic-economic services, but also helps end users pick appropriate services that best fit their use cases and budgets.
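The abstract names the TUE metric without giving its formula. As a minimal sketch, assume TUE is the ratio of the total data sync traffic to the size of the data update that triggered it, so that a value near 1 means little traffic is wasted. This ratio form, the function name, and the parameter names are assumptions for illustration and are not taken from the abstract.

```python
def traffic_usage_efficiency(total_sync_traffic_bytes: float,
                             data_update_bytes: float) -> float:
    """Assumed form of TUE: total sync traffic divided by data update size.

    Under this assumption, a value close to 1.0 means nearly all traffic
    carried useful data; larger values indicate more overhead.
    """
    if data_update_bytes <= 0:
        raise ValueError("data_update_bytes must be positive")
    return total_sync_traffic_bytes / data_update_bytes


# Hypothetical numbers: a 1 KiB local change that causes 50 KiB of sync traffic.
tue = traffic_usage_efficiency(50 * 1024, 1 * 1024)
print(tue)
```

Under this assumed definition, mechanisms such as compression, incremental sync, and deduplication lower the numerator for a given update, which moves the ratio toward (or below) 1.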