H-Code: A Hybrid MDS Array Code to Optimize Partial Stripe Writes in RAID-6

Chentao Wu,Shenggang Wan,Xubin He,Qiang Cao,Changsheng Xie
DOI: https://doi.org/10.1109/ipdps.2011.78
2011-01-01
Abstract:RAID-6 is widely used to tolerate concurrent failures of any two disks to provide a higher level of reliability with the support of erasure codes. Among many implementations, one class of codes called {\bfseries{M}}aximum {\bfseries{D}}istance {\bfseries{S}}eparable ({\bfseries{MDS}}) codes aims to offer data protection against disk failures with optimal storage efficiency. Typical MDS codes contain horizontal and vertical codes. Due to the horizontal parity, in the case of \emph{partial stripe write} (refers to I/O operations that write new data or update data to a subset of disks in an array) in a row, horizontal codes may get less I/O operations in most cases, but suffer from unbalanced I/O distribution. They also have limitation on high single write complexity. Vertical codes improve single write complexity compared to horizontal codes, while they still suffer from poor performance in partial stripe writes. In this paper, we propose a new XOR-based MDS array code, named Hybrid Code (H-Code), which optimizes partial stripe writes for RAID-6 by taking advantages of both horizontal and vertical codes. H-Code is a solution for an array of $(p+1)$ disks, where $p$ is a prime number. Unlike other codes taking a dedicated anti-diagonal parity strip, H-Code uses a special anti-diagonal parity layout and distributes the anti-diagonal parity elements among disks in the array, which achieves a more balanced I/O distribution. On the other hand, the horizontal parity of H-Code ensures a partial stripe write to continuous data elements in a row share the same row parity chain, which can achieve optimal partial stripe write performance. Not only within a row but also within a stripe, H-Code offers optimal partial stripe write complexity to two continuous data elements and optimal partial stripe write performance among all MDS codes to the best of our knowledge. Specifically, compared to RDP and EVENODD codes, H-Code reduces I/O cost by up to $15.54%$ and $22.17%$. Overall, H-code has optimal storage efficiency, optimal encoding/decoding computational complexity, optimal complexity of both single write and partial stripe write.
What problem does this paper attempt to address?