Traffic-Aware Erasure-Coded Archival Schemes for In-Memory Stores

Bin Xu, Jianzhong Huang, Xiao Qin, Qiang Cao
DOI: https://doi.org/10.1109/tpds.2020.3009092
IF: 5.3
2020-01-01
IEEE Transactions on Parallel and Distributed Systems
Abstract: Redundancy schemes are introduced to in-memory stores to provide fault tolerance. To achieve a good trade-off between access performance and memory efficiency, it is appropriate to adopt replication and erasure coding to keep popular and unpopular data, respectively. Within such a hybrid-redundancy in-memory store, the issue of redundancy transition from replication to erasure coding (a.k.a. erasure-coded archival) should be addressed for unpopular in-memory datasets, since caching workloads exhibit long-tail distributions and most in-memory data are unpopular. If data replicas are distributed across nodes in randomly selected racks, then subsequent data-block-replica retrieval for erasure-coded archival will create cross-rack traffic, and final parity-block relocation will cause extra cross-rack communications. In this article, we propose an encoding-oriented replica placement policy (ERP) by incorporating an interleaved declustering mechanism. We design two traffic-aware erasure-coded archival schemes, TEA-TL and TEA-SL, for ERP-powered in-memory stores by taking into account temporal locality and spatial locality, respectively. With ERP in place, both TEA-TL and TEA-SL schemes embrace the following three salient features: (i) they alleviate cross-rack traffic raised by retrieving required data-block replicas; (ii) they improve rack-level load balancing by distributing replicas via a load-aware primary-rack-selection approach; and (iii) they mitigate block-relocation operations launched to sustain rack-level and node-level fault tolerance. We conduct quantitative performance evaluations using the YCSB benchmark. The empirical results show that both TEA-TL and TEA-SL schemes not only bring forth lower cross-rack traffic than the four candidate encoding schemes, but also exhibit superb archival-throughput and rack-level-balancing performance. In particular, within a group of comparative tests using the baseline configurations, TEA-TL and TEA-SL accelerate archival throughput by at least 36.3 and 70.8 percent, respectively; both TEA-TL and TEA-SL schemes improve rack-level load balancing by a factor of more than 1.45x relative to the four candidate encoding schemes.