Abstract: Kubernetes is an open-source container orchestration system that provides a built-in module for dynamic resource provisioning named the Horizontal Pod Autoscaler (HPA). The HPA identifies the number of resources to be provisioned by calculating the ratio between the current and target utilisation metrics. The target utilisation metric, or threshold, directly impacts how many and how quickly resources will be provisioned. However, the determination of the threshold that would allow satisfying performance-based Service Level Objectives (SLOs) is a long, error-prone, manual process because it is based on the static threshold principle and requires manual configuration. This might result in underprovisioning or overprovisioning, leading to the inadequate allocation of computing resources or SLO violations. Numerous autoscaling solutions have been introduced as alternatives to the HPA to simplify the process. However, the HPA is still the most widely used solution due to its ease of setup, operation, and seamless integration with other Kubernetes functionalities. The present study proposes a method that utilises exploratory data analysis techniques along with moving average smoothing to identify the target utilisation threshold for the HPA. The objective is to ensure that the system functions without exceeding the maximum number of events that result in a violation of the response time defined in the SLO. A prototype was created to adjust the threshold values dynamically, utilising the proposed method. This prototype enables the evaluation and comparison of the proposed method with the HPA, which has the highest threshold set that meets the performance-based SLOs. The results of the experiments proved that the suggested method adjusts the thresholds to the desired service level with a 1–2% accuracy rate and only 4–10% resource overprovisioning, depending on the type of workload.
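
The ratio-based calculation mentioned in the abstract can be illustrated with the scaling rule given in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). The sketch below is a simplified illustration, not the HPA controller's actual code. The function and parameter names are hypothetical, the 0.1 tolerance mirrors the documented controller default, and min/max replica bounds, stabilisation windows, and pod readiness handling are left out.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilisation: float,
    target_utilisation: float,
    tolerance: float = 0.1,  # assumed to match the documented HPA default
) -> int:
    """Simplified HPA rule: scale the replica count by the metric ratio."""
    if target_utilisation <= 0:
        raise ValueError("target_utilisation must be positive")
    ratio = current_utilisation / target_utilisation
    # Skip scaling when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # The same load (90% utilisation on 4 pods) under different thresholds:
    # a lower threshold provisions more pods.
    for target in (75, 60, 50):
        print(target, desired_replicas(4, 90, target))
```

The loop shows why the threshold matters: the load is identical in each call, and only the target utilisation changes the number of pods requested.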

What problem does this paper attempt to address?

The main problem this paper attempts to solve is: **how to dynamically adjust the target utilization threshold of the Kubernetes Horizontal Pod Autoscaler (HPA) so that system performance meets the Service-Level Objectives (SLOs) while minimizing the number of Service-Level Agreement (SLA) violations**. Specifically, existing HPA configurations rely on static thresholds, which make manual configuration complex and can lead to over-allocation or under-allocation of resources. These problems may cause performance degradation or failure to meet the SLO. In addition, the traditional monitoring method based on average response time may not provide sufficient information to predict upcoming SLO violations.

To solve these problems, this research proposes a new method that uses exploratory data analysis and moving-average smoothing to dynamically adjust the target utilization threshold of the HPA. The method aims to ensure that the system adjusts resource allocation according to actual demand without exceeding the maximum allowed number of SLO violations.

### Main contributions:

1. **Introduced a new method**: Supports the identification of the target utilization threshold of the HPA, ensuring that system performance meets the defined SLO and does not exceed the allowed number of violations.
2. **Implemented a prototype solution**: Named SLA-Adaptive Threshold Adjuster (SATA), for evaluating and testing the proposed threshold detection method. The experimental results show that, under different load patterns, the smoothing technique and the length of the data collection cycle have different impacts on the algorithm's efficiency.
3. **Real-environment testing**: Tested using real-world workload traces under various workload conditions, verifying the effectiveness of the method. The experimental results show that this solution enables the HPA to manage resources with almost no impact on performance, with resource over-allocation of only 4–10%, depending on the type of workload.
4. **Emphasized the dynamic adjustment of the threshold**: Even for the same application and the same resource settings, different load patterns require different target utilization values.

In conclusion, the purpose of this research is to improve the automatic scaling decision-making process of the HPA, enabling it to better meet strict SLO requirements without introducing additional complexity. In this way, users can continue to use the HPA while ensuring that system performance meets the requirements of the SLA.
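
As a rough illustration of the two ingredients named above, the sketch below pairs a trailing moving average with a simple threshold-selection rule. It is not the SATA algorithm, whose details are not given here. The selection rule, the function names `moving_average` and `pick_threshold`, and the sample data are all assumptions made for the example: the rule picks the highest candidate utilization at or below which the share of observations violating the response-time SLO stays within an allowed budget.

```python
from typing import List, Optional, Sequence, Tuple


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing simple moving average; early points use the data available."""
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed: List[float] = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


def pick_threshold(
    samples: Sequence[Tuple[float, float]],  # (utilization %, response time ms)
    slo_ms: float,
    max_violation_ratio: float,
    candidates: Sequence[float],
) -> Optional[float]:
    """Hypothetical rule: highest candidate whose violation share fits the budget."""
    best: Optional[float] = None
    for threshold in sorted(candidates):
        observed = [rt for util, rt in samples if util <= threshold]
        if not observed:
            continue
        violations = sum(1 for rt in observed if rt > slo_ms)
        if violations / len(observed) <= max_violation_ratio:
            best = threshold
    return best


if __name__ == "__main__":
    # Made-up observations: response time grows with utilization.
    utilization = [40.0, 50.0, 60.0, 70.0, 80.0]
    response_ms = [100.0, 120.0, 150.0, 260.0, 400.0]
    print(moving_average(response_ms, window=3))
    samples = list(zip(utilization, response_ms))
    print(pick_threshold(samples, slo_ms=200.0, max_violation_ratio=0.2,
                         candidates=[50, 60, 70, 80]))
```

In this toy data set a tighter violation budget yields a lower threshold, which matches the point that the target utilization has to follow both the SLO and the observed load pattern.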

SLA-Adaptive Threshold Adjustment for a Kubernetes Horizontal Pod Autoscaler

Dynamically Adjusting Scale of a Kubernetes Cluster under QoS Guarantee

Microservice Auto-Scaling Algorithm Based on Workload Prediction in Cloud-Edge Collaboration Environment

Horizontal Pod Autoscaling in Kubernetes for Elastic Container Orchestration

Zeus: Improving Resource Efficiency Via Workload Colocation for Massive Kubernetes Clusters

HCA Operator: A Hybrid Cloud Auto-scaling Tooling for Microservice Workloads

AHPA: Adaptive Horizontal Pod Autoscaling Systems on Alibaba Cloud Container Service for Kubernetes

Smart HPA: A Resource-Efficient Horizontal Pod Auto-scaler for Microservice Architectures

Self-adaptive, Requirements-driven Autoscaling of Microservices

High Concurrency Response Strategy based on Kubernetes Horizontal Pod Autoscaler

Deep Learning-Based Autoscaling Using Bidirectional Long Short-Term Memory for Kubernetes

A Time Series-Based Approach to Elastic Kubernetes Scaling

Hybrid Autoscaling Strategy on Container-Based Cloud Platform

Traffic-Aware Horizontal Pod Autoscaler in Kubernetes-Based Edge Computing Infrastructure

A Trend Detection-Based Auto-Scaling Method for Containers in High-Concurrency Scenarios

Machine Learning-Based Scaling Management for Kubernetes Edge Clusters

OOSP: Opportunistic Optimization Scheme for Pod Deployment Enhanced with Multilayered Sensing

Enhancing Machine Learning-Based Autoscaling for Cloud Resource Orchestration

Intelligent microservices autoscaling module using reinforcement learning

HANSEL: Adaptive Horizontal Scaling of Microservices Using Bi-LSTM