Network Load Balancing with In-network Reordering Support for RDMA

Cha Hwan Song, Xin Zhe Khooi, Raj Joshi, Inho Choi, Jialin Li, Mun Choon Chan
DOI: https://doi.org/10.1145/3603269.3604849
2023-09-01
Abstract: Remote Direct Memory Access (RDMA) is widely used in high-performance computing (HPC) and data center networks. In this paper, we first show that RDMA does not work well with existing load balancing algorithms because of its traffic flow characteristics and its assumption of in-order packet delivery. We then propose ConWeave, a load balancing framework designed for RDMA. The key idea of ConWeave is that, with the right design, it is possible to perform fine-granularity rerouting and mask the effect of out-of-order packet arrivals transparently in the network datapath using a programmable switch. We have implemented ConWeave on a Tofino2 switch. Evaluations show that ConWeave can achieve up to 42.3% and 66.8% improvement in average and 99th-percentile flow completion time (FCT), respectively, compared to state-of-the-art load balancing algorithms.