DSC: Dynamic Stripe Construction for Asynchronous Encoding in Clustered File System

Shuzhan Wei, Yongkun Li, Yinlong Xu, Si Wu
DOI: https://doi.org/10.1109/infocom.2017.8056998
2017-01-01
Abstract: Many clustered file systems (CFSes) now adopt asynchronous encoding, which transforms replicated data into erasure-coded data to maintain data availability with bounded storage overhead. Existing implementations of asynchronous encoding construct coding stripes from logically sequential data blocks, an approach that suffers from heavy cross-rack traffic and necessitates data block redistribution. Recent work [12] solves this problem by carefully distributing replicated data blocks among racks at the time they are written, but it is not applicable when existing systems have different data layouts or when the data layout changes. In this paper, we propose Dynamic Stripe Construction (DSC) to transform N-way replication to erasure coding. DSC does not induce any cross-rack traffic for encoding, and it does not require data block redistribution after encoding. Moreover, DSC is general enough to be applied to any existing CFS with various erasure codes, and it can be deployed on a distributed file system in a hot-pluggable manner. To validate the effectiveness of DSC, we implement it on HDFS. Through extensive testbed experiments in a real storage cluster, we show that DSC can significantly increase the encoding throughput and reduce the foreground user response time compared with the traditional approach.
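The contrast between stripes built from logically sequential blocks and stripes built from blocks that already share a rack can be illustrated with a toy model. The sketch below is not the DSC algorithm from the paper: the placement map, the helper names (`sequential_stripes`, `rack_local_stripes`, `cross_rack_downloads`), the greedy grouping rule, and the stripe width `k = 3` are all assumptions made for illustration. It also ignores parity placement and the rack-level fault tolerance that DSC must preserve to avoid block redistribution after encoding.

```python
from collections import defaultdict

# Hypothetical placement: block id -> racks holding a replica of that block.
blocks = {
    0: {"A", "B"},
    1: {"C", "D"},
    2: {"A", "C"},
    3: {"B", "C"},
    4: {"A", "D"},
    5: {"C", "D"},
}
k = 3  # data blocks per stripe (assumed value)


def sequential_stripes(blocks, k):
    """Group logically consecutive block ids into stripes of k blocks."""
    ids = sorted(blocks)
    return [ids[i:i + k] for i in range(0, len(ids) - k + 1, k)]


def cross_rack_downloads(stripe, blocks):
    """Count blocks that must cross racks when encoding runs in the
    single rack that already holds the most blocks of the stripe."""
    held = defaultdict(int)
    for b in stripe:
        for rack in blocks[b]:
            held[rack] += 1
    return len(stripe) - max(held.values())


def rack_local_stripes(blocks, k):
    """Greedy illustration: only group blocks that share a replica rack.
    Returns the stripes and any blocks left ungrouped."""
    unassigned = set(blocks)
    stripes = []
    racks = sorted({r for replicas in blocks.values() for r in replicas})
    for rack in racks:
        local = sorted(b for b in unassigned if rack in blocks[b])
        while len(local) >= k:
            stripe, local = local[:k], local[k:]
            stripes.append(stripe)
            unassigned.difference_update(stripe)
    return stripes, sorted(unassigned)


seq = sequential_stripes(blocks, k)
local, leftover = rack_local_stripes(blocks, k)
print("sequential:", seq, sum(cross_rack_downloads(s, blocks) for s in seq))
print("rack-local:", local, sum(cross_rack_downloads(s, blocks) for s in local))
```

With this placement, each sequential stripe needs one block fetched from another rack, while each rack-local stripe can be encoded inside a single rack. A greedy grouping like this can leave blocks ungrouped under other placements, which is one reason it should be read only as an intuition aid.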