MLAD: A Unified Model for Multi-system Log Anomaly Detection

Runqiang Zang, Hongcheng Guo, Jian Yang, Jiaheng Liu, Zhoujun Li, Tieqiao Zheng, Xu Shi, Liangfan Zheng, Bo Zhang
2024-01-15
Abstract: In spite of the rapid advancements in unsupervised log anomaly detection techniques, the current mainstream models still necessitate specific training for individual system datasets, resulting in costly procedures and limited scalability due to dataset size, thereby leading to performance bottlenecks. Furthermore, numerous models lack cognitive reasoning capabilities, posing challenges in direct transferability to similar systems for effective anomaly detection. Additionally, akin to reconstruction networks, these models often encounter the "identical shortcut" predicament, wherein the majority of system logs are classified as normal, erroneously predicting normal classes when confronted with rare anomaly logs due to reconstruction errors.
Software Engineering, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

The paper primarily aims to address three key issues in multi-system log anomaly detection:

1. **Insufficient Model Generalization Capability**: Current mainstream log anomaly detection models require separate training for each system's dataset, which is costly and scales poorly because of dataset size limitations.
2. **Lack of Cognitive Reasoning Ability**: Many existing models lack cognitive reasoning ability, so they are hard to transfer directly to similar systems for effective anomaly detection.
3. **"Identical Shortcut" Problem**: Like reconstruction networks, these models often fall into the "identical shortcut": because the majority of system logs are normal, the model tends to predict the normal class even when it encounters rare anomalous logs.

To address these issues, the authors propose a new model called MLAD (Multi-system Log Anomaly Detection), which:

- Uses Sentence-BERT to capture the similarity between log sequences from different systems and converts them into high-dimensional learnable semantic vectors.
- Modifies the attention-layer formula to identify the importance of each keyword in a sequence, and models the distribution of the entire multi-system dataset through appropriate vector space diffusion.
- Uses a Gaussian Mixture Model (GMM) to highlight the uncertainty of rare words, optimizing the vector space of samples to address the "identical shortcut" problem.

Experimental results show that MLAD outperforms previous models on three real-world datasets.
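
As a rough illustration of why a mixture model can help with rare events, the sketch below fits a small Gaussian mixture to "normal" data and scores new points by their negative log-likelihood, so that points far from every component stand out. This is not MLAD's implementation: the summary does not say how the GMM is wired into the network. The one-dimensional feature, the toy values, and the names `fit_gmm_1d` and `anomaly_score` are all assumptions made for this example. In MLAD the inputs would be Sentence-BERT-derived vectors rather than single numbers.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=50, min_var=1e-3):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative only)."""
    n = len(xs)
    xs_sorted = sorted(xs)
    # Deterministic initialisation: spread the initial means over the sorted data.
    means = [xs_sorted[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(xs) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in xs) / n, min_var)
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, min_var)  # floor to avoid collapse
    return weights, means, variances


def anomaly_score(x, model):
    """Negative log-likelihood of x under the mixture; higher means more unusual."""
    weights, means, variances = model
    density = sum(w * gaussian_pdf(x, m, v)
                  for w, m, v in zip(weights, means, variances))
    return -math.log(max(density, 1e-300))


# Hypothetical 1-D features for normal log sequences from two systems.
normal_features = [0.9, 1.0, 1.1, 1.2, 0.8, 1.0, 4.9, 5.0, 5.1, 5.2, 4.8, 5.0]
model = fit_gmm_1d(normal_features, k=2)

for value in (1.0, 5.0, 20.0):
    print(value, round(anomaly_score(value, model), 2))
```

Using two components lets one model cover the normal behaviour of both toy "systems" at once, which loosely mirrors the multi-system setting. A point such as 20.0 that sits far from both clusters receives a much higher score than points near either cluster centre.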