Abstract: We address the under-utilization of datacenter resources in cloud operations, focusing on the challenge of online virtual machine (VM) scheduling. Rather than following the traditional approach of scheduling VMs solely by their static flavors, we take their dynamic CPU utilization into account. We employ Γ-robustness theory to manage this dynamic behavior and introduce a novel variant of bin packing -(), which theoretically protects the Physical Machines (PMs) from hotspot formation within a specified probability α. We develop a scheduling algorithm named CloseRadiusFit and cold-start AI-based prediction algorithms for the online version of this problem. To measure how close our approach comes to the optimal solutions, we solve the offline problem by designing a novel Mixed Integer Linear Programming (MILP) model and a combination of numerical upper and lower bounds. Our experimental results demonstrate that CloseRadiusFit achieves narrow gaps of 1.6% and 3.1% when compared to the lower and upper bounds, respectively.

Note to Practitioners: A growing trend in the cloud industry involves overcommitting VMs on PMs. While this approach can ease the problem of low resource utilization in datacenters, it also introduces a higher risk of hotspots due to resource contention among VMs. In this work, we propose a novel method that leverages Γ-robustness theory and introduce effective heuristics to achieve ultimate utilization of datacenter resources while ensuring desirable service quality. We validate our approach using real-world production data from Huawei Cloud, improving resource utilization by 125% over traditional flavor-based allocation methods while keeping the occurrence of hotspots below 5% (α = 0.05). Our solution requires only the VMs' real utilization data, which is typically already collected in cloud providers' production environments. Therefore, with minimal modifications to the existing scheduling system, cloud providers can easily implement our solution and reap its benefits. Moreover, when historical utilization data for VMs is unavailable (cold start), we use machine learning to predict the VM utilization statistics that our approach needs.
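To make the idea of utilization-aware placement concrete, the following is a minimal sketch of a robust capacity check combined with plain first-fit. It is not CloseRadiusFit, whose details are not given in this abstract. It assumes the standard budgeted-uncertainty reading of Γ-robustness (in the sense of Bertsimas and Sim), where a PM must accommodate the nominal load of its VMs plus the Γ largest deviations. The names `VM`, `PM`, `robust_load`, and `first_fit`, and the per-VM statistics `mean` and `deviation`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VM:
    mean: float       # nominal CPU usage in cores (assumed statistic)
    deviation: float  # worst-case excess above the mean in cores (assumed)


@dataclass
class PM:
    capacity: float   # CPU capacity in cores
    vms: List[VM] = field(default_factory=list)


def robust_load(vms: List[VM], gamma: int) -> float:
    """Nominal load plus the `gamma` largest deviations (budgeted uncertainty)."""
    nominal = sum(vm.mean for vm in vms)
    largest = sorted((vm.deviation for vm in vms), reverse=True)[:gamma]
    return nominal + sum(largest)


def first_fit(vm: VM, pms: List[PM], gamma: int) -> Optional[int]:
    """Place `vm` on the first PM whose robust load stays within capacity.

    Returns the index of the chosen PM, or None if no PM can host the VM.
    Plain first-fit is used as a stand-in heuristic, not the paper's algorithm.
    """
    for index, pm in enumerate(pms):
        if robust_load(pm.vms + [vm], gamma) <= pm.capacity:
            pm.vms.append(vm)
            return index
    return None


if __name__ == "__main__":
    machines = [PM(capacity=8.0), PM(capacity=8.0)]
    arrivals = [VM(2.0, 1.0), VM(2.0, 2.0), VM(2.0, 1.5), VM(1.0, 0.5)]
    print([first_fit(vm, machines, gamma=2) for vm in arrivals])
```

In this sketch a larger `gamma` makes the check more conservative (fewer VMs per PM), while `gamma = 0` reduces it to packing by mean utilization alone. How Γ relates to the hotspot probability α is part of the paper's analysis and is not reproduced here.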

PREACT: Predictive Resource Allocation for Bursty Workloads in a Co-located Data Center

Online Cost-Aware Service Requests Scheduling in Hybrid Clouds for Cloud Bursting

AI-oriented Workload Allocation for Cloud-Edge Computing

PROMPT: Learning Dynamic Resource Allocation Policies for Network Applications

System Resource Utilization Analysis and Prediction for Cloud Based Applications under Bursty Workloads

A Novel Reactive-Predictive Hybrid Resource Provision Method in Cloud Datacenter

PCS: Predictive Component-level Scheduling for Reducing Tail Latency in Cloud Online Services

RPPS: A Novel Resource Prediction and Provisioning Scheme in Cloud Data Center

Predictive Control for Dynamic Resource Allocation in Enterprise Data Centers

Smart VM Co-Scheduling with the Precise Prediction of Performance Characteristics

Prepartition: Paradigm for the Load Balance of Virtual Machine Allocation in Data Centers

Heterogeneity-aware Proactive Elastic Resource Allocation for Serverless Applications

Prepartition: Load Balancing Approach for Virtual Machine Reservations in a Cloud Data Center

Aggressive Resource Provisioning for Ensuring QoS in Virtualized Environments

PSRPS: A Workload Pattern Sensitive Resource Provisioning Scheme for Cloud Systems

Burstiness-aware Server Consolidation Via Queuing Theory Approach in a Computing Cloud

Resource Allocation with Workload-Time Windows for Cloud-Based Software Services: A Deep Reinforcement Learning Approach

Preemptive Scheduling for Distributed Machine Learning Jobs in Edge-Cloud Networks

Hotspot-Aware Scheduling of Virtual Machines with Overcommitment for Ultimate Utilization in Cloud Datacenters

Hybrid Cloud Adaptive Scheduling Strategy for Heterogeneous Workloads

A Bi-Objective Learn-and-Deploy Scheduling Method for Bursty and Stochastic Requests on Heterogeneous Cloud Servers