Toward Optimal Storage Scaling via Network Coding: From Theory to Practice

Xiaoyang Zhang, Yuchong Hu, Patrick P. C. Lee, Pan Zhou
DOI: https://doi.org/10.1109/infocom.2018.8485961
2018-01-01
Abstract: To adapt to the increasing storage demands and varying storage redundancy requirements, practical distributed storage systems need to support storage scaling by relocating currently stored data to different storage nodes. However, the scaling process inevitably transfers substantial data traffic over the network. Thus, minimizing the bandwidth cost of the scaling process is critical in distributed settings. In this paper, we show that optimal storage scaling is achievable in erasure-coded distributed storage based on network coding, by allowing storage nodes to send encoded data during scaling. We formally prove the information-theoretically minimum scaling bandwidth. Based on our theoretical findings, we also build a distributed storage system prototype NCScale, which realizes network-coding-based scaling while preserving the necessary properties for practical deployment. Experiments on Amazon EC2 show that the scaling time can be reduced by up to 50% over the state-of-the-art.