An Automatic Model Management System and Its Implementation for AIOps on Microservice Platforms

Ruibo Chen, Yanjun Pu, Bowen Shi, Wenjun Wu
DOI: https://doi.org/10.1007/s11227-023-05123-4
IF: 3.3
2023-01-01
The Journal of Supercomputing
Abstract: As applications built on microservice architectures expand, the complexity of system operation and maintenance grows significantly. AIOps makes it possible to use machine learning models to automatically monitor system state, allocate resources, raise alerts, and detect anomalies. Because online workloads are dynamic, the running state of a production microservice system is constantly in flux. AIOps models must therefore be continuously trained, packaged, and deployed against the current system state so that they adapt to the changing environment. This paper proposes a model update and management pipeline framework for AIOps models in microservice systems to meet this need and simplify the process. A prototype system based on Kubernetes and GitLab provides a preliminary implementation and validation of the framework. The system consists of three components: model training, model packaging, and model deployment. Model training incorporates parallelization and parameter search to enable rapid training of multiple models and automated hyperparameter tuning. Packaging and deployment are automated with continuous integration technology. Experiments on the prototype system demonstrate the feasibility of the proposed framework. This work serves as a useful resource for constructing an integrated and streamlined AIOps model management system.
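The abstract describes model training that combines parallelization with parameter search. The following is a minimal sketch of that general idea only, not the paper's implementation: the toy linear model, the data, the grid, and every function name are assumptions made for illustration, and a local thread pool stands in for whatever parallel execution the Kubernetes-based prototype actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical toy data lying exactly on y = 2x + 1 (an assumption for illustration).
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [1.0, 3.0, 5.0, 7.0, 9.0]


def train_and_score(params):
    """Fit y = w*x + b by gradient descent and return (final MSE, params)."""
    lr, epochs = params
    w = b = 0.0
    n = len(XS)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(XS, YS)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(XS, YS)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    mse = sum((w * x + b - y) ** 2 for x, y in zip(XS, YS)) / n
    return mse, params


def parallel_grid_search(grid, max_workers=4):
    """Train one model per hyperparameter combination concurrently; keep the best."""
    candidates = list(product(*grid.values()))
    # Threads keep the sketch simple; a real CPU-bound workload would more
    # likely use separate processes or separate cluster jobs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(train_and_score, candidates))
    return min(results, key=lambda r: r[0])


# Hypothetical search space: learning rate and number of epochs.
GRID = {"lr": [0.001, 0.01, 0.05], "epochs": [50, 200]}
best_mse, best_params = parallel_grid_search(GRID)
```

Each grid point is an independent training job, which is what makes the search easy to parallelize; in a cluster setting the same structure would map to one job per candidate configuration.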