MLAD: A Unified Model for Multi-system Log Anomaly Detection

Runqiang Zang, Hongcheng Guo, Jian Yang, Jiaheng Liu, Zhoujun Li, Tieqiao Zheng, Xu Shi, Liangfan Zheng, Bo Zhang

2024-01-15

Abstract: In spite of the rapid advancements in unsupervised log anomaly detection techniques, the current mainstream models still necessitate specific training for individual system datasets, resulting in costly procedures and limited scalability due to dataset size, thereby leading to performance bottlenecks. Furthermore, numerous models lack cognitive reasoning capabilities, posing challenges in direct transferability to similar systems for effective anomaly detection. Additionally, akin to reconstruction networks, these models often encounter the "identical shortcut" predicament, wherein the majority of system logs are classified as normal, erroneously predicting normal classes when confronted with rare anomaly logs due to reconstruction errors.

Software Engineering, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

The paper addresses three key issues in multi-system log anomaly detection:

1. **Insufficient Model Generalization Capability**: Current mainstream log anomaly detection models require separate training for each system's dataset, which is costly and scales poorly because performance is limited by dataset size.
2. **Lack of Cognitive Reasoning Ability**: Most existing models lack cognitive reasoning ability, so they cannot be transferred directly to similar systems for effective anomaly detection.
3. **"Identical Shortcut" Problem**: Like reconstruction networks, these models often fall into the "identical shortcut": because the majority of system logs are classified as normal, the models erroneously predict the normal class when confronted with rare anomalous logs.

To address these issues, the authors propose a new model called MLAD (Multi-system Log Anomaly Detection), which:

- Uses Sentence-BERT to capture the similarity between log sequences from different systems and converts them into high-dimensional learnable semantic vectors.
- Modifies the formula of the attention layer to identify the importance of each keyword in the sequence, and models the distribution of the entire multi-system dataset through appropriate vector space diffusion.
- Uses a Gaussian Mixture Model (GMM) to highlight the uncertainty of rare words, optimizing the vector space of samples to mitigate the "identical shortcut" problem.

Experimental results on three real-world datasets show that MLAD outperforms previous models.
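To make the role of a Gaussian Mixture Model more concrete, the following is a minimal sketch of how a mixture can assign low likelihood to rare inputs. It is not the MLAD implementation: it simply scores a vector by its negative log-likelihood under a fixed mixture, which is a simplified stand-in for the paper's use of a GMM to highlight rare-word uncertainty. The 2-D vectors stand in for Sentence-BERT embeddings, and all names and parameter values (`WEIGHTS`, `MEANS`, `VARIANCES`, `anomaly_score`) are hypothetical.

```python
import math


def diag_gaussian_logpdf(x, mean, var):
    """Log density of x under a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of x under the mixture, using log-sum-exp for stability."""
    logs = [
        math.log(w) + diag_gaussian_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(value - top) for value in logs))


def anomaly_score(x, weights, means, variances):
    """Higher score means x is less likely under the mixture."""
    return -gmm_log_likelihood(x, weights, means, variances)


# Hypothetical mixture: each component stands in for the normal logs
# of one system (assumed values, not taken from the paper).
WEIGHTS = [0.5, 0.5]
MEANS = [[0.0, 0.0], [5.0, 5.0]]
VARIANCES = [[1.0, 1.0], [1.0, 1.0]]

normal_like = [0.1, -0.2]  # close to the first component
rare_like = [2.5, -3.0]    # far from both components

score_normal = anomaly_score(normal_like, WEIGHTS, MEANS, VARIANCES)
score_rare = anomaly_score(rare_like, WEIGHTS, MEANS, VARIANCES)
print(score_normal < score_rare)
```

In this toy setting the vector far from both components receives the higher score. In practice the mixture parameters would be fitted to data rather than fixed by hand, and a threshold on the score would have to be chosen per deployment.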

A Novel System Anomaly Prediction System Based on Belief Markov Model and Ensemble Classification

A Unified Model for Multi-class Anomaly Detection

Natural Language Processing-based Model for Log Anomaly Detection

End-to-End AutoML for Unsupervised Log Anomaly Detection

A LSTM-Based Anomaly Detection Model for Log Analysis

LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs

DMAD: Dual Memory Bank for Real-World Anomaly Detection

MetaLog: Generalizable Cross-System Anomaly Detection from Logs with Meta-Learning

Toward Multi-class Anomaly Detection: Exploring Class-aware Unified Model against Inter-class Interference

MDFULog: Multi-Feature Deep Fusion of Unstable Log Anomaly Detection Model

LogMS: a multi-stage log anomaly detection method based on multi-source information fusion and probability label estimation

CausalConvLSTM: Semi-Supervised Log Anomaly Detection Through Sequence Modeling

Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection

Log-based Anomaly Detection with Deep Learning: How Far Are We?

MLAD: Manifest and Latent Anomaly Detection based on the Integration of Reconstruction and MLFP-KNN Methods

LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection

Develop End-to-End Anomaly Detection System

Deep Learning-based Anomaly Detection and Log Analysis for Computer Networks

OMLog: Online Log Anomaly Detection for Evolving System with Meta-learning

SSDLog: a semi-supervised dual branch model for log anomaly detection