Proactive Stripe Reconstruction to Improve Cache Use Efficiency of SSD-Based RAID Systems
Zhibing Sha, Jiaojiao Wu, Jun Li, Balazs Gerofi, Zhigang Cai, Jianwei Liao
DOI: https://doi.org/10.1145/3609099
2023-01-01
ACM Transactions on Embedded Computing Systems
Abstract: Solid-State Drives (SSDs) exhibit different failure characteristics from conventional hard disk drives. In particular, the Bit Error Rate (BER) of an SSD increases as it absorbs more writes. Parity-based Redundant Array of Inexpensive Disks (RAID) arrays built from SSDs were therefore introduced to address correlated failures. In a RAID-5 implementation, specifically, generating (or updating) the parity associated with a data stripe involves both read and write operations on the SSDs. Whenever an update request arrives at the RAID system, the related parity must also be updated and flushed to the SSD that holds it. Such frequent parity updates degrade RAID performance and shorten the lifetime of the SSDs. Consequently, the RAID controller is commonly equipped with a DRAM cache, called the parity cache, which buffers the most frequently updated parity chunks to boost I/O performance. To improve the use efficiency of the parity cache, this paper proposes a stripe reconstruction approach that minimizes the number of parity updates on SSDs, thus boosting the I/O performance of the SSD RAID system. When the currently updated stripe contains both cold and hot updated data chunks, the approach proactively carries out stripe reconstruction if it can find a matching stripe that also contains cold and hot updated data chunks on the complementary RAID components. In the reconstruction process, the cold data chunks of the two matched stripes are first grouped into a new stripe, whose parity chunk is flushed to the RAID component. The hot data chunks are then organized into another new stripe, whose parity chunk is buffered in the parity cache. This yields better cache use efficiency, because it reduces the number of parity updates on the SSD RAID components and proactively frees cache space to absorb subsequent write requests quickly. 
In addition, the proposed method adjusts the target SSD of write requests during stripe reconstruction by considering the I/O workload balance across all SSDs. Experimental results show that our proposal reduces the number of parity chunk updates on SSDs by 2.3% and overall I/O latency by 12.2% on average, compared to state-of-the-art parity cache management techniques.
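
The reconstruction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Chunk` type, its `hot` flag, and the functions `xor_parity`, `is_complementary`, and `reconstruct` are hypothetical names introduced here, and the sketch assumes that two stripes "match" when, on every SSD, exactly one of them holds a hot chunk. How the paper classifies hotness, searches for a matching stripe, or balances load across SSDs is not covered.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Chunk:
    data: bytes
    hot: bool  # assumed hotness flag; the paper's classifier is not modeled


def xor_parity(chunks):
    """RAID-5 style parity: byte-wise XOR over equally sized data chunks."""
    columns = zip(*(c.data for c in chunks))
    return bytes(reduce(lambda x, y: x ^ y, col) for col in columns)


def is_complementary(a, b):
    """Assumed matching rule: on every SSD position, exactly one of the
    two stripes holds a hot chunk (so both stripes mix hot and cold)."""
    if len(a) != len(b):
        return False
    mixed = any(c.hot for c in a) and not all(c.hot for c in a)
    return mixed and all(x.hot != y.hot for x, y in zip(a, b))


def reconstruct(a, b):
    """Regroup two matched stripes into one all-cold and one all-hot stripe.

    Stripes are lists indexed by SSD position, so each new stripe still
    places exactly one chunk on each SSD. Returns
    (cold_stripe, cold_parity, hot_stripe, hot_parity), or None when the
    stripes do not match. In the scheme the abstract describes, the cold
    parity would be flushed to the SSD and the hot parity kept in the
    parity cache.
    """
    if not is_complementary(a, b):
        return None
    cold = [x if not x.hot else y for x, y in zip(a, b)]
    hot = [x if x.hot else y for x, y in zip(a, b)]
    return cold, xor_parity(cold), hot, xor_parity(hot)


if __name__ == "__main__":
    stripe_a = [Chunk(b"\x01", True), Chunk(b"\x02", True),
                Chunk(b"\x04", False), Chunk(b"\x08", False)]
    stripe_b = [Chunk(b"\x10", False), Chunk(b"\x20", False),
                Chunk(b"\x40", True), Chunk(b"\x80", True)]
    cold, cold_parity, hot, hot_parity = reconstruct(stripe_a, stripe_b)
    print("flush to SSD:", cold_parity.hex())
    print("keep in parity cache:", hot_parity.hex())
```

Under these assumptions, the parity of the regrouped cold stripe is unlikely to change again soon, so it can be written to the SSD once, while the single cached parity now covers all of the hot chunks from both original stripes.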