On the Implementation of Zigzag Codes for Distributed Storage System

Lijia Lu, Hui Li, Jun Chen, Bing Zhu, Weijun Yin
DOI: https://doi.org/10.1109/bigdata.2015.7363951
2015-01-01
Abstract: Erasure codes such as Reed-Solomon (RS) codes are widely used to improve data reliability in distributed storage systems. Although erasure codes greatly reduce storage overhead compared to replication schemes, they remain very costly in terms of network bandwidth when repairing a failed node. To address this problem, we employ the Zigzag code, an MDS array code with the optimal repair property, in a practical system. Specifically, we first build a general system on Hadoop to evaluate the encoding, decoding, and repair performance of different codes, and then implement Zigzag codes on that system. The experimental results show that the measured performance of Zigzag codes agrees with the theoretical findings and that the codes have certain advantages. Compared to current HDFS modules that use RS codes, our Zigzag-based HDFS implementation significantly reduces repair disk I/O and repair bandwidth at the same computational complexity.
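
The repair-cost gap described in the abstract can be illustrated with a short calculation. The sketch below is not taken from the paper. It assumes conventional RS repair, which reads k whole surviving blocks, and the theoretical optimum for an (n, k) MDS code in which each of the n - 1 surviving nodes contributes a 1/(n - k) fraction of its block, the ratio Zigzag codes achieve when rebuilding a systematic node. The function names and the (n, k) values are hypothetical choices for illustration, not the paper's measured configuration.

```python
from fractions import Fraction


def rs_repair_blocks(n: int, k: int) -> Fraction:
    """Data read to rebuild one node with conventional RS repair,
    measured in units of one block: any k surviving blocks in full."""
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    return Fraction(k)


def optimal_repair_blocks(n: int, k: int) -> Fraction:
    """Data read under the optimal repair ratio (assumed here):
    a 1/r fraction from each of the n - 1 survivors, with r = n - k."""
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    r = n - k  # number of parity nodes
    return Fraction(n - 1, r)


def repair_saving(n: int, k: int) -> Fraction:
    """Fraction of repair traffic saved relative to conventional RS repair."""
    return 1 - optimal_repair_blocks(n, k) / rs_repair_blocks(n, k)


if __name__ == "__main__":
    # Hypothetical (n, k) pairs chosen only for illustration.
    for n, k in [(6, 4), (14, 10)]:
        print(n, k, rs_repair_blocks(n, k),
              optimal_repair_blocks(n, k), repair_saving(n, k))
```

Under these assumptions, a (6, 4) code reads 5/2 blocks instead of 4 to repair one node. These are theoretical ratios only; the disk I/O and bandwidth actually measured on HDFS are reported in the paper itself.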