An Ultra-Low Latency and Compatible PCIe Interconnect for Rack-Scale Communication.
Yibo Huang,Yukai Huang,Ming Yan,Jiayu Hu,Cunming Liang,Yang Xu,Wenxiong Zou,Yiming Zhang,Rui Zhang,Chunpu Huang,Jie Wu
DOI: https://doi.org/10.1145/3555050.3569128
2022-01-01
Abstract:Emerging network-attached resource disaggregation architecture requires ultra-low latency rack-scale communication. However, current hardware offloading (e.g., RDMA) and user-space (e.g., mTCP) communication schemes still rely on heavily layered protocol stacks which requires the translation between PCIe bus and network protocol, or complex connection/memory resource management within RNICs, inevitably bringing latency overhead. We argue that PCIe Non-Transparent Bridge (NTB) is a superior high-speed in-rack network technology to interconnect PCIe-attached machines or devices with the same PCIe fabric since no translation is needed between PCIe and network protocol. We present NTSocks, the first user-space in-rack interconnect over PCIe fabric which virtualizes native NTB into high-level network functionalities for rack-scale systems with software-hardware co-design. NTSocks provides (1) compatibility with a fast socket-like abstraction, (2) multi-thread scalability using a core-driven dat-aplane model, and (3) fair and efficient resource sharing with a multi-tenant isolation mechanism. Even though PCIe NTB is originally designed for device communication across PCIe domains, NTSocks shows a flexible user-level indirection with performance close to bare-metal NTB while providing common network stack features. In the evaluations with latency-sensitive Key-Value Store, NTSocks achieves better latency by up to 24.5× and 1.58× than kernel and RDMA socket, respectively.