Scale-RS: an Efficient Scaling Scheme for RS-Coded Storage Clusters

Jianzhong Huang, Xianhai Liang, Xiao Qin, Ping Xie, Changsheng Xie
DOI: https://doi.org/10.1109/tpds.2014.2326156
IF: 5.3
2015-01-01
IEEE Transactions on Parallel and Distributed Systems
Abstract: Erasure-coded storage clusters must be scaled to meet growing requirements for storage capacity and I/O performance. In this study, we propose Scale-RS, an efficient scaling scheme for Reed-Solomon-coded storage clusters, which has three salient features. First, Scale-RS achieves uniform data distribution by placing data blocks equally among old and new chunks using a transposed data layout. Second, Scale-RS minimizes the data movement incurred during data redistribution and parity update. It reaches the lower bound of data migration traffic by transferring only the necessary data blocks from old data chunks to new chunks, and it reduces update traffic by generating parity difference blocks from the data blocks stored in an individual data chunk. Third, Scale-RS improves the I/O performance of scaled storage clusters in terms of read parallelism and write throughput. We implement Scale-RS along with two alternative scaling schemes in a Reed-Solomon-coded storage cluster, on which real-world I/O traces are replayed. Experimental results demonstrate that Scale-RS achieves the highest read performance among the three scaling schemes after data redistribution. When scaling from six data chunks to nine, Scale-RS outperforms the other two scaling schemes in aggregate write throughput by factors of 2.85 and 3.05 under online filling and offline filling, respectively. We also show that user response time increases slightly during data redistribution because migration traffic and user I/Os compete for bandwidth.
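The migration lower bound mentioned in the abstract can be illustrated with a small calculation. The sketch below is not the authors' implementation; it only assumes that "uniform data distribution" means every data chunk, old and new, ends up holding the same number of data blocks. Under that assumption the new chunks start empty, so at least new/(old + new) of the existing data blocks have to move. The function names and the block count of 54 are hypothetical, chosen for illustration.

```python
from fractions import Fraction


def min_migration_fraction(old_chunks: int, new_chunks: int) -> Fraction:
    """Smallest fraction of existing data blocks that must move so that
    all chunks (old and new) hold an equal share after scaling.

    Assumption: new chunks start empty and the target layout is uniform.
    """
    if old_chunks <= 0 or new_chunks < 0:
        raise ValueError("old_chunks must be positive, new_chunks non-negative")
    return Fraction(new_chunks, old_chunks + new_chunks)


def min_blocks_to_migrate(total_blocks: int, old_chunks: int, new_chunks: int) -> int:
    """Minimum number of data blocks to transfer to the new chunks.

    Assumption: total_blocks divides evenly across all chunks.
    """
    total_chunks = old_chunks + new_chunks
    if total_blocks % total_chunks != 0:
        raise ValueError("total_blocks must be divisible by the chunk count")
    # Each new chunk must receive total_blocks / total_chunks blocks.
    return total_blocks * new_chunks // total_chunks


# Scaling from six data chunks to nine, as in the abstract's experiment.
print(min_migration_fraction(6, 3))      # 1/3
print(min_blocks_to_migrate(54, 6, 3))   # 18
```

For the six-to-nine case, one third of the existing data blocks is the least that any uniform redistribution can move, which is the bound the abstract says Scale-RS reaches.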