AsymNVM

Teng Ma, Mingxing Zhang, Kang Chen, Zhuo Song, Yongwei Wu, Xuehai Qian
DOI: https://doi.org/10.1145/3373376.3378511
2020-01-01
Abstract: Byte-addressable non-volatile memory (NVM) is a promising technology because it simultaneously provides DRAM-like performance, disk-like capacity, and persistency. Current byte-addressable NVM deployment is symmetric: NVM devices are directly attached to servers. Because of its higher density, NVM provides much larger capacity and should be shared among servers. Unfortunately, in the symmetric setting, the availability of an NVM device depends on the specific machine it is attached to. High availability can be achieved by replicating data to NVM on a remote machine, but this requires full replication of the data structure in local memory, which limits the size of the working set. This paper rethinks NVM deployment and makes a case for the asymmetric byte-addressable non-volatile memory architecture, which decouples servers from persistent data storage. In the proposed AsymNVM architecture, NVM devices (i.e., back-end nodes) can be shared by multiple servers (i.e., front-end nodes) and provide recoverable persistent data structures. The asymmetric architecture, which follows the industry trend of resource disaggregation, is made possible by high-performance networks (e.g., RDMA). At the same time, AsymNVM raises a number of key problems, such as the still relatively long network latency, the persistency bottleneck, and the simple interface of the back-end NVM nodes. We build the AsymNVM framework on the AsymNVM architecture; it implements: 1) high-performance persistent data structure updates; 2) NVM data management; 3) concurrency control; and 4) crash consistency and replication. The key idea for removing the persistency bottleneck is the use of an operation log, which reduces stall time due to RDMA writes and enables efficient batching and caching in front-end nodes. To evaluate performance, we construct eight widely used data structures and two transaction applications based on the AsymNVM framework.
In a 10-node cluster equipped with real NVM devices, results show that AsymNVM achieves similar or better performance compared to the best possible symmetric architecture while enjoying the benefits of disaggregation. We found that the speedup brought by the proposed optimizations is drastic: 5× to 12× across all benchmarks.
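To make the operation-log idea in the abstract concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes a front-end that applies updates to a local cache, records each update as a log entry, and ships entries to a back-end in batches, so that one simulated remote write covers several operations. The names `BackEnd`, `FrontEnd`, `persist`, `flush`, and `recover` are hypothetical, and an in-memory list stands in for remote NVM reached over RDMA.

```python
from dataclasses import dataclass, field


@dataclass
class BackEnd:
    """Hypothetical stand-in for a remote NVM node (no real RDMA involved)."""
    log: list = field(default_factory=list)
    writes: int = 0  # counts simulated remote writes (round trips)

    def persist(self, batch):
        # One call models one remote write that carries a whole batch.
        self.log.extend(batch)
        self.writes += 1


@dataclass
class FrontEnd:
    """Hypothetical front-end node with a local cache and a pending-op buffer."""
    backend: BackEnd
    batch_size: int = 4
    cache: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)

    def put(self, key, value):
        self.cache[key] = value                    # serve later reads locally
        self.pending.append(("put", key, value))   # record the operation
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Ship all buffered operations in a single simulated remote write.
        if self.pending:
            self.backend.persist(self.pending)
            self.pending = []


def recover(backend):
    """Rebuild the key-value state by replaying the persisted operation log."""
    state = {}
    for op, key, value in backend.log:
        if op == "put":
            state[key] = value
    return state


backend = BackEnd()
front = FrontEnd(backend, batch_size=4)
for i in range(10):
    front.put(f"k{i % 3}", i)
front.flush()  # persist the tail that did not fill a batch
```

In this toy, ten updates cost three simulated remote writes instead of ten, and replaying the log reproduces the front-end cache. Operations still sitting in `pending` are lost if the front-end crashes before `flush`; a real system would have to close that gap, and this sketch does not attempt to show how AsymNVM does so.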