KRCORE: a microsecond-scale RDMA control plane for elastic computing

Xingda Wei, Fangming Lu, Rong Chen, Haibo Chen
DOI: https://doi.org/10.48550/arXiv.2201.11578
2022-06-09
Abstract: We present KRCORE, an RDMA library with a microsecond-scale control plane on commodity RDMA hardware for elastic computing. KRCORE can establish a full-fledged RDMA connection within 10 μs (hundreds or thousands of times faster than verbs), while only maintaining a (small) fixed-sized connection metadata at each node, regardless of the cluster scale. The key ideas include virtualizing pre-initialized kernel-space RDMA connections instead of creating one from scratch, and retrofitting advanced RDMA dynamic connected transport with static transport for both low connection overhead and high networking speed. Under load spikes, KRCORE can shorten the worker bootstrap time of an existing disaggregated key-value store (namely RACE Hashing) by 83%. In serverless computing (namely Fn), KRCORE can also reduce the latency for transferring data through RDMA by 99%.
Networking and Internet Architecture