Performance Characterization of SmartNIC NVMe-over-Fabrics Target Offloading
Jiexiong Xu, Yue Qiu, Yiquan Chen, Yijing Wang, Wenhai Lin, Yiquan Lin, Shushu Zhao, Yuqi Liu, Ying Wang, Wenzhi Chen
DOI: https://doi.org/10.1145/3688351.3689154
2024-01-01
Abstract: NVMe-over-Fabrics (NVMe-oF) is gaining popularity in cloud data centers as a remote storage protocol for accessing NVMe storage devices across servers. With the rapid increase in the throughput of NVMe storage devices, the NVMe-oF stack consumes a significant amount of valuable CPU resources. To free this computing power for other tasks, many smartNICs now support NVMe-oF target offloading. However, the performance of this emerging offloading scheme has not been fully investigated. In this work, we conducted comprehensive evaluations of smartNIC NVMe-oF target offloading and of non-offloaded NVMe-oF over TCP and RDMA. Our results demonstrate that target offloading achieves throughput comparable to NVMe/RDMA in most synthetic and real-world application workloads while freeing up to 38.7% of CPU resources. Additionally, target offloading benefits from host bypass and peer-to-peer communication between the smartNIC and the NVMe SSD, resulting in an average latency that is only 68.9% to 95.8% of that of NVMe/RDMA. However, we also found that target offloading schemes lack flexibility with specific storage device models, which leads to abnormal performance degradation. Finally, we provide insights to help system architects select the appropriate scheme based on workload and storage device features, so that they can fully leverage the advantages of target offloading.