Exploring the Benefits of Resource Disaggregation for Service Reliability in Data Centers

Chao Guo, Xinyu Wang, Gangxiang Shen, Jiahe Xu, Moshe Zukerman, Sanjay Bose
DOI: https://doi.org/10.1109/tcc.2022.3151923
IF: 5.697
2022-01-01
IEEE Transactions on Cloud Computing
Abstract: By overcoming the “server box” barrier, resource disaggregation in data centers (DCs) can significantly improve resource utilization, which may in turn provide a more cost-efficient approach to resource upgrade and expansion. Earlier research has explored the advantages of resource disaggregation for improving the efficiency of resource usage. This paper investigates the potential benefits of resource disaggregation from the aspect of reliability, which has not been considered before. Resource disaggregation gives rise to a new failure pattern. For example, in a conventional server, the failure of one type of resource leads to the failure of the entire server, so that the other types of resources in the same server also become unavailable. After disaggregation, failures of different types of resources become more isolated, so that the other resources remain available. In this paper, we model the reliability of a resource allocation request in a server-based or disaggregated DC based on whether the request is allocated with only working resources or is also provisioned with backup resources. We then consider a resource allocation problem to maximize the number of requests accepted with guaranteed reliability. This is formulated as an integer linear programming (ILP) problem, and a more straightforward heuristic approach is also proposed. Our numerical studies demonstrate that it may be possible to significantly improve service reliability with this resource disaggregation approach.
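The failure-isolation argument in the abstract can be illustrated with a minimal sketch. This is not the paper's reliability model: it assumes independent failures, one unit of each resource type per request, an ideal network, and at most one backup (a whole spare server in the server-based case, one spare unit per resource type in the disaggregated case). The function names and the 0.9 reliability values are hypothetical.

```python
from math import prod


def server_based(rel, backup=False):
    """Reliability of a request in a server-based DC (simplified assumption).

    A server works only if every resource inside it works, so the
    per-resource reliabilities multiply. A backup is a second whole server.
    """
    one_server = prod(rel.values())
    if backup:
        # The request survives unless both servers fail.
        return 1 - (1 - one_server) ** 2
    return one_server


def disaggregated(rel, backup=False):
    """Reliability of a request in a disaggregated DC (simplified assumption).

    Each resource type fails independently of the others. A backup is one
    spare unit per resource type, so redundancy applies per type.
    """
    if backup:
        return prod(1 - (1 - r) ** 2 for r in rel.values())
    return prod(rel.values())


# Hypothetical per-resource reliabilities, chosen only for illustration.
rel = {"cpu": 0.9, "memory": 0.9, "storage": 0.9}

print(f"server-based, no backup:  {server_based(rel):.6f}")
print(f"server-based, backup:     {server_based(rel, backup=True):.6f}")
print(f"disaggregated, no backup: {disaggregated(rel):.6f}")
print(f"disaggregated, backup:    {disaggregated(rel, backup=True):.6f}")
```

Under these assumptions the two architectures give the same reliability without backup (about 0.729), while with backup the disaggregated case reaches about 0.970 against about 0.927 for the server-based case, because a single failed resource no longer takes the other resource types down with it.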
computer science, information systems, theory & methods