MDFULog: Multi-Feature Deep Fusion of Unstable Log Anomaly Detection Model

Gang Li, Mingle Zhou, Mengjie Sun, Min Li, Delong Han

DOI: https://doi.org/10.3390/app13042237

2023-02-09

Applied Sciences

Abstract: Effective log anomaly detection can help operators locate and solve problems quickly, ensure the rapid recovery of the system, and reduce economic losses. However, recent log anomaly detection studies have shown some drawbacks, such as concept drift, noise problems, and fuzzy feature relation extraction, which cause data instability and abnormal misjudgment, leading to significant performance degradation. This paper proposes a multi-feature deep fusion of an unstable log anomaly detection model (MDFULog) for the above problems. The MDFULog model uses a novel log resolution method to eliminate the dynamic interference caused by noise. This paper proposes a feature enhancement mechanism that fully uses the correlation between semantic information, time information, and sequence features to detect various types of log exceptions. The introduced semantic feature extraction model based on Bert preserves the semantics of log messages and maps them to log vectors, effectively eliminating worker randomness and noise injection caused by log template updates. An Informer anomaly detection classification model is proposed to extract practical information from a global perspective and predict outliers quickly and accurately. Experiments were conducted on HDFS, OpenStack, and unstable datasets, showing that the anomaly detection method in this paper performs significantly better than available algorithms.

Computer Science

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper aims to address several key issues in log anomaly detection:

1. **Concept Drift, Noise Issues, and Ambiguity in Feature Relationship Extraction**: Current log anomaly detection methods suffer from concept drift, noise interference, and unclear feature relationship extraction, leading to data instability and false anomaly detection, significantly reducing performance.
2. **Insufficient Correlation Between Different Features**: Existing anomaly detection methods typically train learning models for different features (such as time anomalies, parameter anomalies, etc.) separately, ignoring the correlation between different features, which affects detection accuracy.
3. **Insufficient Handling of Long Sequence Dependencies**: Current research mainly uses variants of Recurrent Neural Networks (RNNs) such as LSTM to detect anomalies in log data. However, these methods can only obtain historical sequence information and cannot capture complex long sequence dependencies from a global perspective.

To address the above issues, the authors propose a multi-feature deep fusion log anomaly detection model (MDFULog). This model eliminates dynamic noise interference through a novel log parsing method and introduces a Bert-based semantic feature extraction model and an Informer-based classification model, effectively improving the accuracy and speed of anomaly detection. Experimental results show that this model significantly outperforms existing algorithms on HDFS, OpenStack, and synthetic unstable datasets.
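
To make the idea of removing dynamic noise through log parsing more concrete, here is a minimal sketch in Python. It is not the parsing method proposed in the paper, whose details are not given in this summary. It only illustrates the general technique of masking variable fields so that messages differing only in runtime parameters collapse to one stable template. The masking rules, the `to_template` helper, and the sample lines (written in the style of HDFS logs) are all assumptions made for illustration.

```python
import re

# Hypothetical masking rules, applied in order. These are illustrative
# assumptions, not the rules used by MDFULog.
MASKS = [
    (re.compile(r"blk_-?\d+"), "<BLK>"),                            # block identifiers
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"), "<IP>"),  # IPv4 with optional port
    (re.compile(r"\b\d+\b"), "<NUM>"),                              # any remaining integers
]


def to_template(message: str) -> str:
    """Replace dynamic fields with placeholder tokens and normalize whitespace."""
    for pattern, token in MASKS:
        message = pattern.sub(token, message)
    return " ".join(message.split())


# Made-up sample messages that differ only in their dynamic parameters.
logs = [
    "Receiving block blk_-1608999687919862906 src: /10.250.19.102:54106 dest: /10.250.19.102:50010",
    "Receiving block blk_7503483334202473044 src: /10.251.215.16:55695 dest: /10.251.215.16:50010",
]

# Both messages map to the same template once the parameters are masked.
templates = {to_template(m) for m in logs}
for t in templates:
    print(t)
```

In a pipeline like the one the summary describes, templates of this kind would then be passed to a semantic encoder rather than matched as exact strings, so that small wording changes in updated templates do not produce entirely new, unseen events.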

MLog: Mogrifier LSTM-based Log Anomaly Detection Approach Using Semantic Representation

LLMeLog: an Approach for Anomaly Detection Based on LLM-enriched Log Events

AFALog: A General Augmentation Framework for Log-based Anomaly Detection with Active Learning

Distributed system anomaly detection using deep learning‐based log analysis

Natural Language Processing-based Model for Log Anomaly Detection

Deep Learning-based Anomaly Detection and Log Analysis for Computer Networks

Log-based Anomaly Detection with Deep Learning: How Far Are We?

Robust Log-Based Anomaly Detection on Unstable Log Data

Log anomaly detection method based on CNN and LSTM fusion

LogMS: a multi-stage log anomaly detection method based on multi-source information fusion and probability label estimation

SSDLog: a semi-supervised dual branch model for log anomaly detection

Log2graphs: An Unsupervised Framework for Log Anomaly Detection with Efficient Feature Extraction

Load Balancing Based on Process Migration for MPI

LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs

A robust multi-scale feature extraction framework with dual memory module for multivariate time series anomaly detection

MLAD: A Unified Model for Multi-system Log Anomaly Detection

Federated Anomaly Detection on System Logs for the Internet of Things: A Customizable and Communication-Efficient Approach

DCFF-MTAD: A Multivariate Time-Series Anomaly Detection Model Based on Dual-Channel Feature Fusion

Leveraging RAG-Enhanced Large Language Model for Semi-Supervised Log Anomaly Detection

Log-based Anomaly Detection Without Log Parsing