Lovelock: Towards Smart NIC-hosted Clusters

Seo Jin Park, Ramesh Govindan, Kai Shen, David Culler, Fatma Özcan, Geon-Woo Kim, Hank Levy
2023-09-22
Abstract: Traditional cluster designs were originally server-centric and have recently evolved to support hardware acceleration and storage disaggregation. In applications that leverage acceleration, the server CPU orchestrates computation and data movement, and data-intensive applications stress memory bandwidth. Applications that leverage disaggregation can be adversely affected by the increased PCIe and network bandwidth demand that disaggregation creates. In this paper, we advocate for a specialized cluster design for important data-intensive applications, such as analytics, query processing, and ML training. This design, Lovelock, replaces each server in a cluster with one or more headless smart NICs. Because smart NICs are significantly cheaper than servers per unit of bandwidth, the resulting cluster can run these applications without adversely impacting performance, while obtaining cost and energy savings.
Distributed, Parallel, and Cluster Computing