Abstract: Adding disks to a RAID-4 storage system increases I/O parallelism and expands storage capacity at the same time. To regain load balance across all disks, old and new, RAID-4 scaling requires moving certain data blocks onto the newly added disks. Existing data redistribution approaches to RAID-4 scaling preserve a round-robin data distribution and therefore must migrate all the data, which makes RAID-4 scaling expensive. In this paper, we propose McPod, a new data redistribution approach that accelerates RAID-4 scaling. McPod minimizes the number of data blocks to be moved while maintaining a uniform data distribution across all data disks. McPod also optimizes data migration with four techniques. First, it coalesces multiple accesses to physically successive blocks into a single I/O. Second, it piggybacks parity updates during data migration to reduce the cost of maintaining consistent parities. Third, it outsources all parity updates brought by RAID scaling to a surrogate disk. Fourth, it delays recording data migration on disks to minimize the number of metadata writes without compromising data reliability. We implement McPod in Linux Kernel 2.6.32.9 and evaluate its performance by replaying three real-system traces. The results demonstrate that McPod outperforms the existing “moving-everything” approach by 67.78-79.64 percent in redistribution time and by 14.24-27.16 percent in user response time. The experiments also show that the performance of a RAID scaled using McPod is almost identical to that of a round-robin RAID.
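The gap between "moving everything" and minimal migration can be illustrated with simple counting. The sketch below is not McPod's addressing algorithm, which the abstract does not specify. It only compares two quantities: how many blocks change location when a round-robin layout is preserved across scaling, and the lower bound that any uniform redistribution must meet, since the new disks must end up holding n/(m+n) of the data. The function names and the example sizes (3 old data disks, 2 new ones, 60 blocks) are assumptions chosen for illustration.

```python
from fractions import Fraction


def round_robin_moved(old_disks, new_disks, blocks):
    """Count blocks whose (disk, offset) changes if round-robin striping is kept.

    Hypothetical model: block b lives at disk b % d, offset b // d,
    where d is the number of data disks.
    """
    total = old_disks + new_disks
    moved = 0
    for b in range(blocks):
        before = (b % old_disks, b // old_disks)
        after = (b % total, b // total)
        if before != after:
            moved += 1
    return moved


def minimal_moved(old_disks, new_disks, blocks):
    """Lower bound on moved blocks for a uniform layout after scaling.

    Each new disk must hold blocks / (old + new) blocks, and all of them
    have to come from the old disks.
    """
    fraction = Fraction(new_disks, old_disks + new_disks)
    return int(fraction * blocks)


if __name__ == "__main__":
    m, n, blocks = 3, 2, 60  # illustrative sizes, not taken from the paper
    print("round-robin moves:", round_robin_moved(m, n, blocks))
    print("minimal moves:    ", minimal_moved(m, n, blocks))
```

Under this model, preserving round-robin order relocates every block except those in the first stripe, while a uniform layout needs only the n/(m+n) share of blocks to move.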

Redistribute Data to Regain Load Balance During RAID-4 Scaling.

Rethinking RAID-5 Data Layout for Better Scalability

Accelerate RDP RAID-6 Scaling by Reducing Disk I/Os and XOR Operations

Xscale: Online X-Code RAID-6 Scaling Using Lightweight Data Reorganization

A New Parity-Based Migration Method to Expand RAID-5

Design and Evaluation of a New Approach to RAID-0 Scaling

A Behind-the-Scenes Story on Applying Cross-Layer Coordination to Disks and RAIDs

ALV: A New Data Redistribution Approach to RAID-5 Scaling.

RAID+: Deterministic and Balanced Data Distribution for Large Disk Enclosures.

FastScale: Accelerate RAID Scaling by Minimizing Data Migration.

Raid5x: A Performance-Optimizing Scheme Against Double Disk Failures

Fast recovery for large disk enclosures based on RAID2.0: Algorithms and evaluation

RAID-M: A High Performance RAID Matrix Mass Storage

Disaggregated RAID Storage in Modern Datacenters

Determining Data Distribution for Large Disk Enclosures with 3-D Data Templates

RAID-6Plus: A Fast and Reliable Coding Scheme Aided by Multi-failure Degradation.

Disk Tree - A Case Of Parallel Storage Architecture To Improve Performance In Random Access Pattern

PACEMAKER: Avoiding HeART attacks in storage clusters with disk-adaptive redundancy

SLAS: An efficient approach to scaling round-robin striped volumes

FusionRAID: Achieving Consistent Low Latency for Commodity SSD Arrays.

RAID Organizations for Improved Reliability and Performance: A Not Entirely Unbiased Tutorial (1st revision)