AZ-Recovery: an Efficient Crossing-AZ Recovery Scheme for Erasure Coded Cloud Storage Systems.

Xin Xie, Chentao Wu, Gen Yang, Zongxin Ye, Xubin He, Jie Li, Minyi Guo, Guangtao Xue, Yuanyuan Dong, Yafei Zhao
DOI: https://doi.org/10.1109/srds51746.2020.00031
2020-01-01
Abstract: As the volume of data in modern cloud storage systems grows dramatically, it is common to partition and store data across multiple Availability Zones (AZs). Multiple AZs not only provide high reliability but also reduce network latency. Erasure Codes (ECs) are widely used across multiple AZs to provide high reliability at low storage cost. However, the recovery cost of EC is extremely high in a multi-AZ environment, mainly because a conventional EC reconstructs lost data by transferring data and parities across AZs. Although existing fast recovery approaches can effectively reduce I/O cost or network bandwidth, they are not suitable for multiple AZs. The reasons include low flexibility in complex network scenarios, little consideration of cross-AZ bandwidth, and limited ability to handle multiple disk/node failures. To address these problems, we propose a crossing Availability Zone Recovery (AZ-Recovery) method to improve recovery performance across multiple AZs. AZ-Recovery investigates complex homogeneous and heterogeneous network topologies and finds an optimal data transmission path. Using this method, AZ-Recovery can significantly reduce the recovery cost and save cross-AZ bandwidth in various failure scenarios. To demonstrate the effectiveness of AZ-Recovery, we evaluate various erasure codes via mathematical analysis and simulations in Network Simulator-3. The results show that, compared to traditional erasure coding methods, AZ-Recovery reduces recovery bandwidth by up to 77.47%.