Performance Analysis of Anomaly Detection Methods for Application System on Kubernetes with Auto-scaling and Self-healing

Yoichi Matsuo, Daisuke Ikegami
DOI: https://doi.org/10.23919/cnsm52442.2021.9615544
2021-10-25
Abstract: Kubernetes (K8s) is promising software for application systems because it makes them more flexible and robust through auto-scaling, which automatically scales up application system resources when the system is overloaded, and self-healing, which automatically recovers the system from a failure. However, auto-scaling and self-healing complicate system operators' tasks. First, there is a delay between the execution of auto-scaling or self-healing and the recovery of degraded application performance metrics such as response time. Second, the delay depends on the type of abnormality (i.e., overload or failure). Moreover, auto-scaling and self-healing cannot always resolve the abnormality. Therefore, system operators need to understand the degree of abnormality (i.e., how much the application performance is degraded and how long the delay is). Although many anomaly detection methods have been developed, they have not considered auto-scaling or self-healing taking place when an abnormality occurs. In this paper, we analyze the performance of anomaly detection methods under auto-scaling and self-healing in K8s by implementing the methods and deploying a web application system on K8s. Specifically, we first verified, by injecting anomalies into the web application system, that there is a delay that depends on the type of abnormality. Then, we evaluated the anomaly detection accuracy of each method using data collected from the web application. Finally, we applied a clustering approach to the anomaly scores output by these methods to investigate whether anomaly detection methods can indicate the degree of abnormality. The evaluations show that our analysis provides useful information for operators managing K8s with auto-scaling and self-healing.
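The delay defined in the abstract (the time between executing auto-scaling or self-healing and the recovery of a degraded metric) can be illustrated with a minimal sketch. The function name `recovery_delay`, the threshold-based notion of "recovered", and the sample values are all assumptions made for illustration; they are not the paper's measurement procedure or data.

```python
from typing import Optional, Sequence


def recovery_delay(
    timestamps: Sequence[float],
    response_times: Sequence[float],
    action_time: float,
    threshold: float,
) -> Optional[float]:
    """Seconds from the recovery action until response time is at or below threshold.

    Returns None if the metric never recovers within the observed window.
    "Recovered" here means the first sample at or below the threshold
    (an illustrative assumption, not the paper's definition).
    """
    if len(timestamps) != len(response_times):
        raise ValueError("timestamps and response_times must have equal length")
    for t, rt in zip(timestamps, response_times):
        # Only samples taken at or after the action can count as recovery.
        if t >= action_time and rt <= threshold:
            return t - action_time
    return None


# Synthetic samples: time in seconds, response time in milliseconds.
ts = [0, 10, 20, 30, 40, 50, 60]
rt = [120, 125, 900, 950, 880, 400, 130]

# Assume auto-scaling was executed at t = 30 s and 200 ms counts as healthy.
delay = recovery_delay(ts, rt, action_time=30, threshold=200)
never = recovery_delay(ts, rt, action_time=30, threshold=100)
print(delay, never)
```

A `None` result corresponds to the case the abstract mentions in which auto-scaling or self-healing does not resolve the abnormality within the observation window.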