Eigen: End-to-End Resource Optimization for Large-Scale Databases on the Cloud

Ji You Li, Jiachi Zhang, Wenchao Zhou, Yuhang Liu, Shuai Zhang, Zhuoming Xue, Ding Xu, Hua Fan, Fangyuan Zhou, Feifei Li
DOI: https://doi.org/10.14778/3611540.3611565
IF: 2.5
2023-08-01
Proceedings of the VLDB Endowment
Abstract: Increasingly, cloud database vendors host large-scale, geographically distributed clusters to provide cloud database services. When managing these clusters, we observe that it is challenging to simultaneously maximize the resource allocation ratio and resource availability. This problem becomes more severe in modern cloud database clusters, where resource allocations occur more frequently and on a greater scale. To improve the resource allocation ratio without hurting resource availability, we introduce Eigen, a cloud-native cluster management system for large-scale databases on the cloud. Based on a resource flow model, we propose a hierarchical resource management system and three resource optimization algorithms that enable end-to-end resource optimization. Furthermore, we present system optimizations that improve user experience by reducing scheduling latencies and improving scheduling throughput. Eigen has been deployed in a large-scale public-cloud production environment for 30+ months and serves 30+ regions (100+ availability zones) globally. Based on evaluations of real-world clusters and simulated experiments, Eigen improves the allocation ratio by over 27 percentage points on average (from 60% to 87.0%), while keeping the ratio of delayed resource provisions under 0.1%.
computer science, information systems, theory & methods