Fast Recovery Techniques for Erasure-coded Clusters in Non-uniform Traffic Network.

Yunren Bai, Zihan Xu, Haixia Wang, Dongsheng Wang
DOI: https://doi.org/10.1145/3337821.3337831
2019-01-01
Abstract: Many practical systems adopt erasure codes to ensure reliability while reducing storage overhead. However, erasure codes also suffer from slow recovery. In practice, network links, such as those in peer-to-peer and cross-datacenter networks, typically have non-uniform bandwidth for various reasons. To reduce recovery time, we propose Parallel Pipeline Tree (PPT) and Parallel Pipeline Cross-Tree (PPCT) to speed up single-node and multiple-node recovery, respectively, in non-uniform traffic network environments. By exploiting the bandwidth gap among links, PPT constructs a tree path based on bandwidth and pipelines the data in parallel. By sharing the traffic pressure of requesters with helpers, PPCT constructs a tree-like path and pipelines the data in parallel without additional helpers. We also theoretically explain the effect of using PPT and PPCT in a uniform network environment. Experiments on geo-distributed Amazon EC2 show that PPCT reduces recovery time by up to 37.2% over the traditional technique, and that PPT reduces it by up to 89.2%, 76.4% and 21.6% over the traditional technique, Partial-Parallel-Repair and Repair Pipelining, respectively. PPT and PPCT significantly improve the recovery performance of erasure codes.