Multi-Rack Regenerating Codes for Hierarchical Distributed Storage Systems

Shan Qu, Yu Liu, Jinbei Zhang, Haiwen Cao, Xinbing Wang
DOI: https://doi.org/10.1109/ICC.2018.8422112
2018-01-01
Abstract: Erasure codes provide higher reliability than replication for the same level of redundancy when storing data in distributed storage systems, but at the cost of higher bandwidth overhead. Recently, regenerating codes were introduced, which significantly reduce the repair bandwidth by analyzing the fundamental tradeoff between storage capacity and repair bandwidth via the information flow graph. In practice, distributed storage systems with hierarchical structures are more common in data centers, where data are organized in racks and cross-rack communication is more costly than in-rack communication. Hence, in this paper, we introduce a class of codes, termed multi-rack regenerating codes (MRC), that repair a failed node by downloading data only from nodes in the same rack. Unlike existing works, our codes can reduce the cross-rack repair bandwidth to zero. We also obtain the optimal tradeoff between storage and bandwidth for MRC, and present an explicit construction of MRC within the common product-matrix framework.
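As a rough illustration of the storage/repair-bandwidth tradeoff mentioned in the abstract, the sketch below evaluates the classical cut-set bound for ordinary (single-level) regenerating codes. It is not the MRC tradeoff derived in the paper, which is not stated in the abstract. The symbols are assumptions introduced here: a file of size `M`, any `k` nodes sufficient to recover it, and `d` helper nodes per repair, each sending `beta` symbols to a replacement node that stores `alpha` symbols. The function names are hypothetical.

```python
from fractions import Fraction


def cut_set_capacity(alpha, beta, k, d):
    """Largest file size supported by per-node storage alpha and
    per-helper download beta (classical information-flow-graph bound)."""
    return sum(min(alpha, (d - i) * beta) for i in range(k))


def msr_point(M, k, d):
    """Minimum-storage regenerating point: returns (alpha, gamma),
    where gamma = d * beta is the total repair bandwidth."""
    alpha = Fraction(M, k)
    gamma = Fraction(M * d, k * (d - k + 1))
    return alpha, gamma


def mbr_point(M, k, d):
    """Minimum-bandwidth regenerating point: storage equals repair bandwidth."""
    alpha = Fraction(2 * M * d, k * (2 * d - k + 1))
    return alpha, alpha


# Illustrative parameters only (not taken from the paper).
M, k, d = 12, 3, 4

msr_alpha, msr_gamma = msr_point(M, k, d)
mbr_alpha, mbr_gamma = mbr_point(M, k, d)

# A conventional erasure-code repair downloads the whole file (M symbols);
# both regenerating points need less.
print("naive repair bandwidth:", M)
print("MSR: alpha =", msr_alpha, "gamma =", msr_gamma)
print("MBR: alpha =", mbr_alpha, "gamma =", mbr_gamma)

# Both points sit exactly on the cut-set bound for this file size.
print(cut_set_capacity(msr_alpha, msr_gamma / d, k, d))
print(cut_set_capacity(mbr_alpha, mbr_gamma / d, k, d))
```

With these illustrative parameters, both extreme points repair a node with less than the `M` symbols a conventional erasure-coded repair would download, at the price of either minimal storage (MSR) or extra storage (MBR).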