Zeta: A Scalable and Robust East-West Communication Framework in Large-Scale Clouds

Qianyu Zhang, Gongming Zhao, Hongli Xu, Zhuolong Yu, Liguang Xie, Yangming Zhao, Chunming Qiao, Ying Xiong, Liusheng Huang
2022-01-01
Abstract: With the broad deployment of distributed applications on clouds, the dominant share of traffic in cloud networks travels in the east-west direction, flowing from server to server within a data center. Existing communication solutions are tightly coupled with either the control plane (e.g., the preprogrammed model) or the location of compute nodes (e.g., the conventional gateway model). This tight coupling makes it challenging to adapt to rapid network expansion, respond to network anomalies (e.g., burst traffic and device failures), and maintain low latency for east-west traffic. To address this issue, we design Zeta, a scalable and robust east-west communication framework with gateway clusters in large-scale clouds. Zeta abstracts the traffic forwarding capability as a Gateway Cluster Layer, decoupled from the logic of the control plane and the location of compute nodes. Specifically, Zeta adopts gateway clusters to support large-scale networks and cope with burst traffic. Moreover, a transparent Multi IPs Migration is proposed to quickly recover the system and its devices from unpredictable failures. We implement Zeta based on eXpress Data Path (XDP) and evaluate its scalability and robustness through comprehensive experiments with up to 100k container instances. Our evaluation shows that Zeta reduces the 99% RTT by 5.1x under burst video traffic and speeds up gateway recovery by 10.8x compared with state-of-the-art solutions.