Towards predictable performance via two-layer bandwidth allocation in cloud datacenter.
Hui Yu, Jiahai Yang, Hui Wang, Hui Zhang
DOI: https://doi.org/10.1016/j.jpdc.2018.11.013
IF: 4.542
2019-01-01
Journal of Parallel and Distributed Computing
Abstract: In today’s production-grade cloud datacenters, cloud service providers do not offer any bandwidth guarantee between VMs, which makes the performance of tenants’ applications unpredictable. The research community has recognized this problem; however, existing bandwidth allocation solutions fail to consider tenants’ bandwidth requests and the actual bandwidth usage of applications simultaneously, which leads to either wasted bandwidth or unpredictable performance. To address these issues, we present SpongeNet, a bandwidth allocation solution that consists of three components operating across two layers: static bandwidth guarantees at the tenant layer and dynamic rate allocation at the application layer, which together realize predictable performance. The first component, the FGVC model, is a network abstraction model that provides a simple, accurate and flexible way for tenants to specify network requirements and achieves high utilization through bandwidth saving. The second component is a two-phase VM placement algorithm that provides optimal combinations of ordering policies and dispatching policies to meet multiple goals. The third component, the E–F runtime mechanism, achieves fairness between guaranteed and unguaranteed tenants in utilizing unused bandwidth. Extensive simulations based on real application traces and a 3-level tree topology show that SpongeNet enhances bandwidth saving compared to state-of-the-art solutions (e.g., the Oktopus system), and significantly improves the throughput rate by 18% and response time by 92%. With a small prototype implementation on a 7-server testbed, we demonstrate that SpongeNet provides fair, work-conserving bandwidth guarantees among all tenants, even in extreme cases.
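To make the notion of a work-conserving bandwidth guarantee concrete, the following is a minimal illustrative sketch, not the E–F runtime mechanism itself, whose details the abstract does not give. It assumes a single shared link, that the sum of guarantees does not exceed the link capacity, and that spare bandwidth is divided max-min fairly among all tenants with unmet demand, guaranteed or not. The function name `work_conserving_allocate` and its parameters are hypothetical.

```python
def work_conserving_allocate(capacity, guarantees, demands):
    """Illustrative only: NOT the SpongeNet E-F mechanism.

    Assumes one shared link and sum(guarantees) <= capacity.
    Step 1: each tenant gets min(demand, guarantee).
    Step 2: leftover capacity is shared max-min fairly among
            tenants that still have unmet demand.
    """
    # Step 1: honor guarantees, but never allocate more than is demanded.
    alloc = {t: min(demands[t], guarantees.get(t, 0.0)) for t in demands}
    spare = capacity - sum(alloc.values())

    # Step 2: water-filling over tenants with unmet demand.
    active = {t for t in demands if demands[t] > alloc[t]}
    while active and spare > 1e-9:
        share = spare / len(active)
        # Tenants whose remaining demand fits within an equal share.
        satisfied = {t for t in active if demands[t] - alloc[t] <= share}
        if not satisfied:
            # Nobody can be fully satisfied: split the rest equally.
            for t in active:
                alloc[t] += share
            break
        for t in satisfied:
            spare -= demands[t] - alloc[t]
            alloc[t] = demands[t]
        active -= satisfied
    return alloc


if __name__ == "__main__":
    # Tenant "c" has no guarantee; "a" uses less than its guarantee.
    print(work_conserving_allocate(
        capacity=10.0,
        guarantees={"a": 4.0, "b": 2.0},
        demands={"a": 1.0, "b": 8.0, "c": 5.0},
    ))
```

In this example, tenant "a" leaves part of its guarantee unused, and that unused bandwidth is redistributed between the guaranteed tenant "b" and the unguaranteed tenant "c" instead of being wasted, which is the behavior the term work-conserving refers to.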