MDFULog: Multi-Feature Deep Fusion of Unstable Log Anomaly Detection Model

Gang Li, Mingle Zhou, Mengjie Sun, Min Li, Delong Han
DOI: https://doi.org/10.3390/app13042237
2023-02-09
Applied Sciences
Abstract: Effective log anomaly detection can help operators locate and solve problems quickly, ensure the rapid recovery of the system, and reduce economic losses. However, recent log anomaly detection studies have shown some drawbacks, such as concept drift, noise problems, and fuzzy feature relation extraction, which cause data instability and abnormal misjudgment, leading to significant performance degradation. This paper proposes a multi-feature deep fusion of an unstable log anomaly detection model (MDFULog) for the above problems. The MDFULog model uses a novel log resolution method to eliminate the dynamic interference caused by noise. This paper proposes a feature enhancement mechanism that fully uses the correlation between semantic information, time information, and sequence features to detect various types of log exceptions. The introduced semantic feature extraction model based on BERT preserves the semantics of log messages and maps them to log vectors, effectively eliminating worker randomness and noise injection caused by log template updates. An Informer anomaly detection classification model is proposed to extract practical information from a global perspective and predict outliers quickly and accurately. Experiments were conducted on HDFS, OpenStack, and unstable datasets, showing that the anomaly detection method in this paper performs significantly better than available algorithms.
Computer Science
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to address several key issues in log anomaly detection:

1. **Concept Drift, Noise Issues, and Ambiguity in Feature Relationship Extraction**: Current log anomaly detection methods suffer from concept drift, noise interference, and unclear feature relationship extraction, leading to data instability and false anomaly detection, significantly reducing performance.
2. **Insufficient Correlation Between Different Features**: Existing anomaly detection methods typically train learning models for different features (such as time anomalies, parameter anomalies, etc.) separately, ignoring the correlation between different features, which affects detection accuracy.
3. **Insufficient Handling of Long Sequence Dependencies**: Current research mainly uses variants of Recurrent Neural Networks (RNNs) such as LSTM to detect anomalies in log data. However, these methods can only obtain historical sequence information and cannot capture complex long sequence dependencies from a global perspective.

To address the above issues, the authors propose a multi-feature deep fusion log anomaly detection model (MDFULog). This model eliminates dynamic noise interference through a novel log parsing method and introduces a BERT-based semantic feature extraction model and an Informer-based classification model, effectively improving the accuracy and speed of anomaly detection. Experimental results show that this model significantly outperforms existing algorithms on HDFS, OpenStack, and synthetic unstable datasets.
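
To make the idea of removing dynamic interference from log messages more concrete, the following is a minimal sketch of one common preprocessing step: masking variable fields so that messages differing only in their parameters map to the same template. It is not the parsing method proposed in the paper, whose details are not given here. The masking rules, the placeholder tokens, and the function name `to_template` are all assumptions made for illustration, and the sample messages merely imitate the HDFS log format.

```python
import re

# Hypothetical masking rules (an assumption, not the paper's parser).
# Order matters: specific patterns run before the generic number rule.
MASKS = [
    (re.compile(r"blk_-?\d+"), "<BLK>"),                              # HDFS-style block IDs
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"), "<IP>"),    # IPv4, optional port
    (re.compile(r"\b\d+\b"), "<NUM>"),                                # remaining integers
]


def to_template(message: str) -> str:
    """Replace dynamic fields with placeholders and normalize whitespace."""
    for pattern, token in MASKS:
        message = pattern.sub(token, message)
    return " ".join(message.split())


if __name__ == "__main__":
    # Illustrative messages in the style of HDFS logs.
    a = "Received block blk_-1608999687919862906 of size 91178 from /10.250.10.6"
    b = "Received block blk_7503483334202473044 of size 233217 from /10.251.215.16"
    print(to_template(a))
    print(to_template(a) == to_template(b))
```

In a pipeline like the one summarized above, templates of this kind would then be passed to a semantic encoder such as BERT, so that small wording changes introduced by template updates still yield nearby vectors.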