Abstract: We confront two challenges in the management of a vast and diverse array of online web applications deployed on enterprise-grade auto-scaling infrastructure, primarily focused on ensuring Quality of Service (QoS) for large-scale applications and optimizing resource costs. Firstly, reacting to increased load with a response-based approach can temporarily degrade QoS because many web applications need a few minutes to warm up. Therefore, precise workload prediction is critical for predictive scaling. However, our analysis of real-world applications underscores the substantial challenges arising from the limited precision and robustness of existing single prediction algorithms in the context of predictive auto-scaling. Secondly, guaranteeing the QoS of online applications within a cost-effective structure is crucial, as it is inherently linked to corporate profitability. Nevertheless, our study shows that mainstream auto-scaling methods exhibit various limitations, either being unsuitable for online environments or inadequately ensuring QoS. To address these issues, we introduce PASS, a Predictive Auto-Scaling System tailored for large-scale online web applications in enterprise settings. Our highly robust and accurate prediction framework dynamically integrates and calibrates appropriate prediction algorithms based on the unique characteristics of each application to effectively manage workload diversity. We further establish a performance model derived from online historical logs, enhancing auto-scaling to ensure diverse QoS without adverse impacts on online applications. Additionally, we implement a reactive strategy grounded in queuing theory to promptly address QoS violations resulting from inaccurate predictions or unexpected events. Across a wide spectrum of applications and real-world workloads, PASS outperforms state-of-the-art methods, achieving higher workload prediction accuracy and a superior QoS guarantee rate with less resource cost.
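The abstract does not specify which queuing model the reactive strategy uses. As a rough illustration only, the sketch below assumes a textbook M/M/c queue and the Erlang C formula to find the smallest instance count that keeps the mean queuing delay under a target. The function names `erlang_c` and `min_instances`, the parameters, and the choice of mean waiting time as the QoS metric are all assumptions made for this example, not details taken from PASS.

```python
import math


def erlang_c(c: int, offered_load: float) -> float:
    """Probability that an arriving request has to queue in an M/M/c system.

    offered_load is arrival_rate / service_rate (in Erlangs).
    """
    if offered_load >= c:
        return 1.0  # unstable system: every request waits
    # Erlang B via the standard stable recursion, then convert to Erlang C.
    b = 1.0
    for k in range(1, c + 1):
        b = offered_load * b / (k + offered_load * b)
    rho = offered_load / c
    return b / (1.0 - rho * (1.0 - b))


def min_instances(arrival_rate: float, service_rate: float,
                  max_mean_wait: float, limit: int = 10_000) -> int:
    """Smallest instance count whose mean queuing delay meets the target.

    Hypothetical helper: assumes Poisson arrivals, exponential service times,
    and identical instances, which real web workloads only approximate.
    """
    offered_load = arrival_rate / service_rate
    c = max(1, math.floor(offered_load) + 1)  # need c > offered_load for stability
    while c <= limit:
        mean_wait = erlang_c(c, offered_load) / (c * service_rate - arrival_rate)
        if mean_wait <= max_mean_wait:
            return c
        c += 1
    raise ValueError("no instance count within limit meets the target")


if __name__ == "__main__":
    # 100 requests/s, each instance serves 10 requests/s, target 10 ms mean wait.
    print(min_instances(arrival_rate=100.0, service_rate=10.0, max_mean_wait=0.01))
```

A reactive controller could call such a function with the currently observed arrival rate whenever a QoS violation is detected, then scale out to the returned count. A production system would likely need a richer model than M/M/c, for example one that accounts for warm-up time.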