Abstract: The rapid growth of data volume encourages the wide deployment of large-capacity disks. However, deploying disks in large numbers in turn increases the probability of data loss or damage, because various kinds of disk failures occur. To ensure the integrity of the hosted data, modern storage systems usually adopt erasure codes, which can recover lost data from a small amount of pre-stored redundant information. As the most common recovery case, single disk failure recovery has received intensive attention over the past few years. However, most existing works consider only stripe-level recovery, and thus miss a considerable performance improvement available for single failed disk reconstruction at the stack level (i.e., over a group of rotated stripes). To seize this potential improvement, in this paper we systematically study the problem of single failure recovery at the stack level. We first propose two recovery mechanisms based on greedy algorithms, BP-Scheme and STP-Scheme, which seek a near-optimal solution for any erasure array code at the stack level, and further design a rotated recovery algorithm (RR-Algorithm) to reduce the amount of memory required. Rigorous statistical analysis and intensive evaluation on a real system show that BP-Scheme achieves 3.4 to 38.9 percent (21.2 percent on average) higher recovery speed than Khan's Scheme and 3.4 to 34.8 percent (19.1 percent on average) higher recovery speed than Luo's U-Scheme, while STP-Scheme achieves 3.4 to 46.9 percent (25.15 percent on average) and 3.4 to 41.1 percent (22.3 percent on average) higher recovery speed than Khan's Scheme and Luo's U-Scheme, respectively.

Optimising Disk Read for Node Failure Recovery of RDP Storage Systems

Reconsidering Single Disk Failure Recovery for Erasure Coded Storage Systems: Optimizing Load Balancing in Stack-Level

SA-RSR: a read-optimal data recovery strategy for XOR-coded distributed storage systems

STORE: Data recovery with approximate minimum network bandwidth and disk I/O in distributed storage systems

A Comprehensive Repair Scheme for Distributed Storage Systems

Dominoes: Speculative Repair in Erasure-Coded Hadoop System

Deterministic Data Distribution for Efficient Recovery in Erasure-Coded Storage Systems

Study on Data Redundancy Scheme in Kademlia Cloud Storage System

A Delayed Container Organization Approach to Improve Restore Speed for Deduplication Systems

CORE: Augmenting regenerating-coding-based recovery for single and concurrent failures in distributed storage systems

D3: Deterministic Data Distribution for Efficient Data Reconstruction in Erasure-Coded Distributed Storage Systems

ESetStore: An Erasure-Coded Storage System With Fast Data Recovery

A Practical Cross-Datacenter Fault-Tolerance Algorithm in the Cloud Storage System

Reliability Provision Mechanism for Large-Scale De-Duplication Storage Systems

Dayu: Fast and Low-interference Data Recovery in Very-large Storage Systems

Review of Data Recovery in Storage Systems Based on Erasure Codes

Boosting Correlated Failure Repair in SSD Data Centers

R-Admad: High Reliability Provision For Large-Scale De-Duplication Archival Storage Systems

An Erasure Code-based Approach to Improve Data Recovery and Update Capability

Optimal Repair Algorithm of Single-Disk Failure for Array Codes with Local Properties

Analysis and Construction of Functional Regenerating Codes with Uncoded Repair for Distributed Storage Systems