CloudPlanner: Minimizing Upgrade Risk of Virtual Network Devices for Large-Scale Cloud Networks
Xin He, Enhuan Dong, Jiahai Yang, Shize Zhang, Zhiliang Wang, Zejie Wang, Ye Yang, Jun Zhou, Xiaoqing Sun, Enge Song, Jianyuan Lu, Biao Lyu, Shunmin Zhu
DOI: https://doi.org/10.1109/infocom52122.2024.10621109
2024-01-01
Abstract: Cloud networks continuously upgrade softwarized virtual network devices (VNDs) to meet evolving tenant demands. However, such upgrades may result in unexpected failures. An intuitive way to prevent upgrade failures is to resolve all compatibility issues before deployment, but it is impractical for VND developers to replicate every deployed VND case and test it with large volumes of replayed real traffic. As a result, the operations team accepts some upgrade risk and tests upgrades by deploying them gradually. Although careful upgrade schedule planning is the most common method to minimize upgrade risk, to the best of our knowledge, no VND upgrade schedule planning scheme has been adequately studied for large-scale cloud networks. To fill this gap, we propose CloudPlanner, the first VND upgrade schedule planning scheme that aims to minimize VND upgrade risk for large-scale cloud networks. CloudPlanner prioritizes upgrading the VNDs that are more likely to trigger failures, based on expert knowledge and the properties of VNDs that triggered failures in the past, and limits the number of tenants associated with simultaneously upgraded VNDs. We also propose a heuristic solver that plans schedules quickly and greedily. Using real-world data from production environments, we demonstrate the benefits of CloudPlanner through extensive experiments.
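
The two planning rules named in the abstract (upgrade failure-prone VNDs first, and cap the number of tenants touched by VNDs upgraded at the same time) can be illustrated with a small greedy sketch. This is not the paper's solver: the abstract does not give its formulation, so the `VND` record, the `risk_score` field, the `max_tenants_per_batch` parameter, and the next-fit batching rule below are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VND:
    """Hypothetical record for one virtual network device."""
    name: str
    risk_score: float    # assumed: higher means more likely to trigger a failure
    tenants: frozenset   # tenants whose traffic passes through this device


def plan_batches(vnds, max_tenants_per_batch):
    """Greedily group VNDs into ordered upgrade batches.

    Devices are visited in descending risk_score order. A device joins the
    current batch unless that would push the batch's distinct tenant count
    above the cap, in which case the batch is closed and a new one starts.
    A single device that alone exceeds the cap still gets its own batch.
    """
    batches = []
    current, current_tenants = [], set()
    for vnd in sorted(vnds, key=lambda v: v.risk_score, reverse=True):
        merged = current_tenants | vnd.tenants
        if current and len(merged) > max_tenants_per_batch:
            # Close the current batch and start a new one with this device.
            batches.append(current)
            current = []
            merged = set(vnd.tenants)
        current.append(vnd.name)
        current_tenants = merged
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    fleet = [
        VND("gw-a", 0.9, frozenset({"t1", "t2"})),
        VND("gw-b", 0.7, frozenset({"t2", "t3"})),
        VND("gw-c", 0.4, frozenset({"t4"})),
        VND("gw-d", 0.1, frozenset({"t1"})),
    ]
    print(plan_batches(fleet, max_tenants_per_batch=3))
```

With the made-up fleet above and a cap of three tenants, the two riskiest devices share the first batch and the remaining two form the second. Closing a batch as soon as the cap would be exceeded keeps batches strictly ordered by risk, at the cost of sometimes leaving spare capacity in earlier batches.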