Fast recovery for large disk enclosures based on RAID2.0: Algorithms and evaluation

Qiliang Li,Min Lyu,Liangliang Xu,Yinlong Xu
DOI: https://doi.org/10.1016/j.jpdc.2024.104854
IF: 4.542
2024-02-08
Journal of Parallel and Distributed Computing
Abstract:The RAID2.0 architecture, which uses dozens or even hundreds of disks, is widely adopted for large-capacity data storage. However, limited resources like memory and CPU cause RAID2.0 to execute batch recovery for disk failures. The traditional random data placement and recovery schemes result in highly skewed I/O access within a batch, which slows down the recovery speed. To address this issue, we propose DR-RAID, an efficient reconstruction scheme that balances local rebuilding workloads across all surviving disks within a batch. We dynamically select a batch of tasks with almost balanced read loads and make intra-batch adjustments for tasks with multiple solutions of reading source chunks. Furthermore, we use a bipartite graph model to achieve a uniform distribution of write loads. DR-RAID can be applied with homogeneous or heterogeneous disk rebuilding bandwidth. Experimental results demonstrate that in offline rebuilding, DR-RAID enhances the rebuilding throughput by up to 61.90% compared to the random data placement scheme. With varied rebuilding bandwidth, the improvement can reach up to 65.00%.
computer science, theory & methods
What problem does this paper attempt to address?