Achelous: Enabling Programmability, Elasticity, and Reliability in Hyperscale Cloud Networks.
Chengkun Wei, Xing Li, Ye Yang, Xiaochong Jiang, Tianyu Xu, Bowen Yang, Taotao Wu, Chao Xu, Yilong Lv, Haifeng Gao, Zhentao Zhang, Zikang Chen, Zeke Wang, Zihui Zhang, Shunmin Zhu, Wenzhi Chen
DOI: https://doi.org/10.1145/3603269.3604859
2023-01-01
Abstract: Cloud computing has witnessed tremendous growth, prompting enterprises to migrate to the cloud for reliable and on-demand computing. Within a single Virtual Private Cloud (VPC), the number of instances (such as VMs, bare-metal servers, and containers) has reached millions, which poses three challenges: supporting millions of instances with network location decoupled from the underlying hardware, delivering highly elastic performance, and maintaining high reliability. However, academic studies have primarily focused on specific issues such as the high-speed data plane and virtualized routing infrastructure, while existing industrial network technologies fail to adequately address these challenges. In this paper, we report on the design of, and experience with, Achelous, Alibaba Cloud's network virtualization platform. Achelous consists of three key designs to enhance hyperscale VPC: (i) a novel hierarchical programming architecture based on the collaborative design of both the data plane and the control plane; (ii) an elastic performance strategy and distributed ECMP schemes for seamless scale-up and scale-out, respectively; (iii) a health check scheme and transparent VM live migration mechanisms that ensure stateful flow continuity during failover. The evaluation results demonstrate that Achelous scales to over 1,500,000 VMs with elastic network capacity in a single VPC and reduces programming time by 25×, with 99% of updates completing within 1 second. For failover, it reduces downtime during VM live migration by 22.5× and ensures that 99.99% of applications do not experience a stall. More importantly, the experience from three years of operation proves Achelous's serviceability and its versatility, independent of any specific hardware platform.