CubicRing: Exploiting Network Proximity for Distributed In-Memory Key-Value Store

Yiming Zhang, Dongsheng Li, Chuanxiong Guo, Haitao Wu, Yongqiang Xiong, Xicheng Lu
DOI: https://doi.org/10.1109/tnet.2017.2669215
2017-01-01
IEEE/ACM Transactions on Networking
Abstract: In-memory storage has the benefits of low I/O latency and high I/O throughput. Fast failure recovery is crucial for large-scale in-memory storage systems, but it brings network-related challenges, including false detection due to transient network problems, traffic congestion during recovery, and top-of-rack switch failures. To achieve fast failure recovery, we present CubicRing, a distributed structure for cube-based networks, which exploits network proximity to restrict failure detection and recovery within the smallest possible one-hop range. We leverage the CubicRing structure to address the aforementioned challenges and design a network-aware in-memory key-value store called MemCube. In a 64-node 10GbE testbed, MemCube recovers 48 GB of data for a single server failure in 3.1 s. The 14 recovery servers achieve 123.9 Gb/s aggregate recovery throughput, which is 88.5% of the ideal aggregate bandwidth and several times faster than RAMCloud with the same configuration.
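The reported figures can be cross-checked with a short calculation. The sketch below is not from the paper; it assumes that 1 GB means 10^9 bytes and that the ideal aggregate bandwidth is one 10 Gb/s link per recovery server (14 x 10 Gb/s = 140 Gb/s). The function names are illustrative only.

```python
# Sanity check of the abstract's recovery numbers.
# Assumptions (not stated in the abstract): 1 GB = 10**9 bytes, and the
# "ideal aggregate bandwidth" is one 10 Gb/s link per recovery server.

def aggregate_throughput_gbps(data_gigabytes: float, seconds: float) -> float:
    """Average throughput in Gb/s for moving data_gigabytes in the given time."""
    return data_gigabytes * 8 / seconds


def fraction_of_ideal(throughput_gbps: float, servers: int, link_gbps: float) -> float:
    """Ratio of measured throughput to the summed link capacity of the servers."""
    return throughput_gbps / (servers * link_gbps)


# 48 GB recovered in 3.1 s, as reported in the abstract.
throughput = aggregate_throughput_gbps(48, 3.1)

# 123.9 Gb/s across 14 recovery servers with assumed 10 Gb/s links each.
efficiency = fraction_of_ideal(123.9, 14, 10.0)

print(f"throughput: {throughput:.1f} Gb/s")
print(f"fraction of ideal: {efficiency:.1%}")
```

Under these assumptions, 48 GB in 3.1 s works out to about 123.9 Gb/s, and 123.9 Gb/s out of 140 Gb/s is 88.5%, so the three numbers in the abstract are mutually consistent.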