Abstract: Emerging trends in Cloud computing bring numerous benefits, such as higher performance, fast and flexible provisioning of applications and capacities, lower infrastructure costs, and almost unlimited scalability. However, the increasing complexity of automated performance and resource management for applications in Cloud computing presents novel challenges that demand enhancements to classical control-based approaches.

An important challenge that Cloud service providers often face is a resource sharing dilemma under workload variation. Cloud service providers pursue higher resource utilization, because the higher the utilization, the lower the hardware, operating, and maintenance costs. On the other hand, resource utilization cannot be too high, or the service provider's revenue could be jeopardized by the inability to meet application-level service-level objectives (SLOs). A crucial research question is how to generate as much revenue as possible by satisfying service-level agreements while reducing costs as much as possible, in order to maximize the profit of Cloud service providers. To this end, classical control-based approaches, which can be classified into three major categories (admission control, queueing and scheduling, and resource allocation), show great potential to address the resource sharing dilemma. However, it is challenging to apply classical control-based approaches directly to computer systems, where first-principle models are generally not available. It becomes even more difficult due to the dynamics seen in real computer systems, including workload variations, multi-tier dependencies, and resource bottleneck shifts.

Fundamentally, the main contributions of this thesis are efforts to enhance classical control-based approaches by leveraging other techniques to address the increasing complexity of automated performance and resource management in the Cloud through dynamic monitoring, modeling and management of performance and resources. More specifically, (1) an admission control approach is enhanced by leveraging decision theory to achieve the most profitable service-level compliance; (2) a critical resource identification approach is enhanced by leveraging statistical machine learning to automatically and adaptively identify critical resources; and (3) a resource allocation approach is enhanced by leveraging hierarchical resource management to achieve the highest resource utilization.

Concretely, the enhanced control-based approaches are implemented in a collection of real control systems: ActiveSLA, vPerfGuard and ERController. The control systems are applied to different real applications, such as OLTP and OLAP database applications and distributed multi-tier web applications, with different workload intensities, types and mixes, in different Cloud environments. All the experimental results show that the prototype control systems outperform existing classical control-based approaches.

Finally, this thesis opens new avenues to address the increasing complexity of automated performance and resource management through enhancement of classical control-based approaches in Cloud environments. Future work will follow these avenues to address the new challenges that arise with the advent of new hardware technology, new software frameworks and new computing paradigms.
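As an illustration of contribution (1), the idea of combining admission control with decision theory can be sketched as an expected-profit comparison: a request is admitted only if the expected profit of running it exceeds the profit of rejecting it. This is a minimal sketch under stated assumptions, not the ActiveSLA implementation. The function names, the `revenue`, `penalty` and `rejection_penalty` parameters, and the assumption that a probability of meeting the SLO (`p_meet`) is available from some prediction model are all hypothetical.

```python
def expected_profit_if_admitted(p_meet: float, revenue: float, penalty: float) -> float:
    """Expected profit of admitting a request (hypothetical profit model).

    p_meet  -- assumed predicted probability that the request meets its SLO
    revenue -- assumed income earned when the SLO is met
    penalty -- assumed cost paid when the SLO is violated
    """
    if not 0.0 <= p_meet <= 1.0:
        raise ValueError("p_meet must be a probability in [0, 1]")
    return p_meet * revenue - (1.0 - p_meet) * penalty


def should_admit(p_meet: float, revenue: float, penalty: float,
                 rejection_penalty: float = 0.0) -> bool:
    """Admit only if admitting has a higher expected profit than rejecting.

    Rejecting is assumed to cost `rejection_penalty` (zero by default).
    """
    profit_admit = expected_profit_if_admitted(p_meet, revenue, penalty)
    profit_reject = -rejection_penalty
    return profit_admit > profit_reject


if __name__ == "__main__":
    # A request that is likely to meet its SLO is admitted.
    print(should_admit(p_meet=0.9, revenue=1.0, penalty=2.0))
    # A request that is likely to violate its SLO is rejected.
    print(should_admit(p_meet=0.5, revenue=1.0, penalty=2.0))
```

Under this assumed model, raising `rejection_penalty` makes the controller more willing to admit risky requests, which reflects the trade-off between revenue and SLO violations described above.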

Autothrottle: A Practical Bi-Level Approach to Resource Management for SLO-Targeted Microservices

Microservice Auto-Scaling Algorithm Based on Workload Prediction in Cloud-Edge Collaboration Environment

Razor: Scaling Backend Capacity for Mobile Applications

Dynamic Monitoring, Modeling and Management of Performance and Resources for Applications in the Cloud

MultiScaler: A Multi-Loop Auto-Scaling Approach for Cloud-Based Applications

Arcus: SLO Management for Accelerators in the Cloud with Traffic Shaping

Graph Neural Network-Based SLO-Aware Proactive Resource Autoscaling Framework for Microservices

LSRAM: A Lightweight Autoscaling and SLO Resource Allocation Framework for Microservices Based on Gradient Descent

PBScaler: A Bottleneck-aware Autoscaling Framework for Microservice-based Applications

Automated Fine-Grained CPU Provisioning for Virtual Machines

Control Strategies for Adaptive Resource Allocation in Cloud Computing

Topology-Aware Scheduling Framework for Microservice Applications in Cloud

Retrospecting Available CPU Resources: SMT-Aware Scheduling to Prevent SLA Violations in Data Centers

Going Fast and Fair: Latency Optimization for Cloud-Based Service Chains

A Heuristic Approach for Scalability of Multi-tiers Web Application in Clouds

Performance-Cost Trade-Off in Auto-Scaling Mechanisms for Cloud Computing

A Hierarchical Receding Horizon Algorithm for QoS-Driven Control of Multi-IaaS Applications

Self-Aware and Self-Adaptive Autoscaling for Cloud Based Services

Enhancing Machine Learning-Based Autoscaling for Cloud Resource Orchestration

Dynamically Balancing Load with Overload Control for Microservices

PROMPT: Learning Dynamic Resource Allocation Policies for Network Applications