MonitorAssistant: Simplifying Cloud Service Monitoring Via Large Language Models
Zhaoyang Yu, Minghua Ma, Chaoyun Zhang, Si Qin, Yu Kang, Chetan Bansal, Saravan Rajmohan, Yingnong Dang, Changhua Pei, Dan Pei, Qingwei Lin, Dongmei Zhang
DOI: https://doi.org/10.1145/3663529.3663826
2024-01-01
Abstract: In large-scale cloud service systems, monitoring metric data and conducting anomaly detection is an important way to maintain reliability and stability. However, a great disparity exists between academic approaches and industrial practice in anomaly detection. Industry predominantly uses simple, efficient methods because of their better interpretability and ease of implementation. In contrast, academically favored deep-learning methods, despite their advanced capabilities, face practical challenges in real-world applications. To address these challenges, this paper introduces MonitorAssistant, an end-to-end practical anomaly detection system via Large Language Models. MonitorAssistant automates model configuration recommendation, achieving knowledge inheritance, and alarm interpretation with guidance-oriented anomaly reports, facilitating more intuitive engineer-system interaction through natural language. By deploying MonitorAssistant in Microsoft's cloud service system, we validate its efficacy and practicality, marking a significant advancement in the field of practical anomaly detection for large-scale cloud services.