Time Series Predictions for Cloud Workloads: A Comprehensive Evaluation
Anna Lackinger, Andrea Morichetta, Schahram Dustdar
DOI: https://doi.org/10.1109/sose62363.2024.00011
2024-01-01
Abstract: Predicting workloads in cloud computing is a significant challenge due to their complex, multidimensional, and highly variable nature. Assessing the accuracy of these predictions is critical, as it directly impacts the management decisions that the infrastructure manager has to make in real time to optimize resource utilization and meet Service Level Agreements (SLAs). Researchers and practitioners have approached workload prediction as a time series problem using both statistical and Machine Learning (ML) methods. However, because resource utilization patterns are volatile and new workloads constantly appear, developing robust solutions is still an open challenge. Furthermore, current solutions often predict only a single workload and lack the capability to generalize to new workload time series. Such approaches negate the advantages of complex methods, because every new workload requires training, validating, and testing a separate model. In this paper, we offer a generalizable approach based on Transfer Learning concepts. We comprehensively evaluate different methods (statistical, Machine Learning, and Deep Learning based) for predicting the resource utilization of cloud workloads by testing them on new, unobserved time series data, thereby assessing their potential for practical applications through Transfer Learning. Specifically, we inspect the algorithms’ performance in predicting one or multiple timestamps ahead, considering both CPU and memory usage. Our main findings indicate that, on test data, the Deep Learning methods Long Short-Term Memory (LSTM) and Transformer are the most suitable for predicting different metrics and different numbers of timestamps ahead. Following Transfer Learning principles, we investigate how the models’ performance varies on out-of-distribution data. When predicting new, unseen workloads, complex models show limitations, although LSTM still proves effective in specific cases. Overall, our research offers insights that can help infrastructure managers make informed design decisions when predicting cloud workload resource usage.