Abstract: Microservices have recently become a commonly used architectural pattern for building cloud-native applications. Cloud computing gives service providers the flexibility to add or remove resources depending on the workload of their web applications. If the resources allocated to a service do not match its requirements, failures and delayed responses increase, resulting in customer dissatisfaction. This problem is a significant challenge in microservices-based applications, because the thousands of microservices in a system may have complex interactions. Auto-scaling is a cloud computing feature that scales resources on demand, allowing service providers to allocate resources to their applications under a dynamic workload without human intervention, minimizing resource cost and latency while maintaining quality of service requirements. In this research, we aimed to establish a computational model for analyzing the workload of all microservices. Because accurately monitoring every microservice and gathering precise performance metrics is usually difficult in a large-scale application with thousands of microservices, the model starts from the overall workload entering the system and takes into account the relationships and function calls between microservices. We then developed a multi-criteria decision-making method to select the candidate microservices for scaling. We tested the proposed approach with three datasets. The experimental results show that the input load toward microservices is detected with an average accuracy of about 99%. Furthermore, compared with existing methods, the proposed approach improves resource utilization by an average of 40.74%, 20.28%, and 28.85% across the three datasets. This is achieved by reducing the number of scaling operations by 54.40%, 55.52%, and 69.82%, respectively. Consequently, fewer resources are required, leading to cost reductions of 1.64%, 1.89%, and 1.67%, respectively.
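
The abstract does not give the computational model or the decision-making method in detail, so the following is only a minimal sketch of the general idea, under stated assumptions: the call graph is acyclic, each edge carries an average number of downstream calls per incoming request, and candidates are ranked by a simple weighted sum of two criteria (utilization and share of total load). The function names, the service names, the weights, and the per-replica capacities are all hypothetical and are not taken from the paper.

```python
from collections import deque


def propagate_workload(entry_load, calls):
    """Estimate per-service load from the load entering the system.

    entry_load: {service: external requests per second}
    calls: {caller: {callee: average calls per incoming request}}
    Assumption (not from the paper): the call graph is acyclic.
    """
    services = set(entry_load) | set(calls)
    for callees in calls.values():
        services |= set(callees)

    # Count incoming edges so services are processed in topological order.
    indegree = {s: 0 for s in services}
    for callees in calls.values():
        for callee in callees:
            indegree[callee] += 1

    load = {s: float(entry_load.get(s, 0.0)) for s in services}
    queue = deque(s for s in services if indegree[s] == 0)
    visited = 0
    while queue:
        svc = queue.popleft()
        visited += 1
        for callee, per_request in calls.get(svc, {}).items():
            # Each request handled by svc triggers per_request calls to callee.
            load[callee] += load[svc] * per_request
            indegree[callee] -= 1
            if indegree[callee] == 0:
                queue.append(callee)
    if visited != len(services):
        raise ValueError("call graph contains a cycle")
    return load


def rank_candidates(load, capacity, replicas, weights=(0.7, 0.3)):
    """Rank services for scaling with a weighted sum of two criteria.

    Criteria (hypothetical): utilization = load / (capacity * replicas),
    and the service's share of the total estimated load.
    """
    total = sum(load.values()) or 1.0
    scores = {}
    for svc, rps in load.items():
        utilization = rps / (capacity[svc] * replicas[svc])
        share = rps / total
        scores[svc] = weights[0] * utilization + weights[1] * share
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical four-service application.
entry_load = {"gateway": 100}
calls = {
    "gateway": {"cart": 0.5, "catalog": 2.0},
    "cart": {"db": 1.0},
    "catalog": {"db": 0.5},
}
load = propagate_workload(entry_load, calls)
capacity = {"gateway": 200, "cart": 100, "catalog": 50, "db": 100}
replicas = {"gateway": 1, "cart": 1, "catalog": 2, "db": 1}
ranking = rank_candidates(load, capacity, replicas)
print(load)
print(ranking)
```

In this sketch only the entry load is measured; the load on every other service is derived from the call graph, which reflects the motivation stated above that monitoring every microservice directly is difficult. A real multi-criteria method would likely use more criteria and a more principled weighting than the fixed weights assumed here.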

Graph Neural Network-Based SLO-Aware Proactive Resource Autoscaling Framework for Microservices

Microservice Auto-Scaling Algorithm Based on Workload Prediction in Cloud-Edge Collaboration Environment

Razor: Scaling Backend Capacity for Mobile Applications

MSARS: A Meta-Learning and Reinforcement Learning Framework for SLO Resource Allocation and Adaptive Scaling for Microservices

LSRAM: A Lightweight Autoscaling and SLO Resource Allocation Framework for Microservices Based on Gradient Descent

DeepScaler: Holistic Autoscaling for Microservices Based on Spatiotemporal GNN with Adaptive Graph Learning

ProScale: Proactive Autoscaling for Microservice with Time-Varying Workload at the Edge

Autothrottle: A Practical Bi-Level Approach to Resource Management for SLO-Targeted Microservices

Deep Learning-Based Autoscaling Using Bidirectional Long Short-Term Memory for Kubernetes

ScalAna: Automating Scaling Loss Detection with Graph Analysis

Self-adaptive, Requirements-driven Autoscaling of Microservices

SRAF: A Service-Aware Resource Allocation Framework for VM Management in Mobile Data Networks

OptScaler: A Hybrid Proactive-Reactive Framework for Robust Autoscaling in the Cloud

An Auto-Scaling Approach for Microservices in Cloud Computing Environments

StatuScale: Status-aware and Elastic Scaling Strategy for Microservice Applications

Cdascaler: a cost-effective dynamic autoscaling approach for containerized microservices

DRPC: Distributed Reinforcement Learning Approach for Scalable Resource Provisioning in Container-based Clusters

A Performance Modelling Approach for SLA-Aware Resource Recommendation in Cloud Native Network Functions

A Graph Neural Networks based Framework for Topology-Aware Proactive SLA Management in a Latency Critical NFV Application Use-case

On the Analysis of Inter-Relationship between Auto-Scaling Policy and QoS of FaaS Workloads

DRS: A deep reinforcement learning enhanced Kubernetes scheduler for microservice‐based system