LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection

Hongcheng Guo, Jian Yang, Jiaheng Liu, Jiaqi Bai, Boyang Wang, Zhoujun Li, Tieqiao Zheng, Bo Zhang, Junran Peng, Qi Tian

2024-01-09

Abstract: Log anomaly detection is a key component in the field of artificial intelligence for IT operations (AIOps). Considering log data of various domains, retraining the whole network for unknown domains is inefficient in real industrial scenarios. However, previous deep models merely focused on extracting the semantics of log sequences in the same domain, leading to poor generalization on multi-domain logs. To alleviate this issue, we propose a unified Transformer-based framework for log anomaly detection (LogFormer) to improve the generalization ability across different domains, where we establish a two-stage process consisting of a pre-training stage and an adapter-based tuning stage. Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data. Then, we transfer such knowledge to the target domain via shared parameters. Besides, the Log-Attention module is proposed to supplement the information ignored by log parsing. The proposed method is evaluated on three public datasets and one real-world dataset. Experimental results on multiple benchmarks demonstrate the effectiveness of our LogFormer with fewer trainable parameters and lower training costs.

Machine Learning, Artificial Intelligence, Software Engineering

What problem does this paper attempt to address?

The paper addresses cross-domain log anomaly detection in the field of artificial intelligence for IT operations (AIOps). Specifically, it tackles two problems:

1. **Cross-domain adaptability**: Existing deep learning models usually perform well only within the domain they were trained on. For an unknown or different domain, the entire network has to be retrained, which is inefficient in real industrial scenarios.
2. **Semantic information loss**: Traditional log parsing methods, although capable of handling out-of-vocabulary (OOV) problems, discard semantic information in the original log data during parsing.

To address these challenges, the authors propose a unified framework named **LogFormer**, which consists of two stages: pre-training and adapter-based tuning. The main contributions of LogFormer are:

- A new pre-training and tuning pipeline for automatic log anomaly detection, built on simple and effective pre-training and adapter-based tuning strategies.
- The Log-Attention module, which supplements the semantic information that log parsing discards.
- A parameter-sharing strategy that significantly reduces training costs, requiring only a small number of additional trainable parameters in the target domain.
- State-of-the-art performance on three public benchmark datasets.

In summary, LogFormer aims to capture the general, cross-domain semantics of log sequences with a pre-trained model and to transfer this knowledge to new domains through lightweight adapters, thereby improving generalization and reducing training costs. The experiments also indicate that LogFormer remains effective and robust under resource-constrained conditions and performs well in practical applications.
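
The claim that adapter-based tuning adds only a small number of trainable parameters can be illustrated with a rough parameter count. The sketch below is not taken from the paper: it assumes a standard Transformer encoder layer, a generic bottleneck adapter (a down-projection followed by an up-projection), two adapters per layer, and hypothetical sizes (`d_model=768`, `d_ff=3072`, `bottleneck=64`). LogFormer's actual configuration may differ.

```python
def encoder_layer_params(d_model: int, d_ff: int) -> int:
    """Parameters (weights + biases) in one standard Transformer encoder layer."""
    # Four linear projections in self-attention: query, key, value, output.
    attention = 4 * (d_model * d_model + d_model)
    # Two linear layers in the position-wise feed-forward block.
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two LayerNorms, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    return attention + feed_forward + layer_norms


def adapter_params(d_model: int, bottleneck: int) -> int:
    """Parameters in one generic bottleneck adapter (assumed design)."""
    down = d_model * bottleneck + bottleneck   # d_model -> bottleneck
    up = bottleneck * d_model + d_model        # bottleneck -> d_model
    return down + up


def trainable_fraction(num_layers: int, d_model: int, d_ff: int,
                       bottleneck: int, adapters_per_layer: int = 2) -> float:
    """Share of parameters that are trained when the backbone is frozen."""
    frozen = num_layers * encoder_layer_params(d_model, d_ff)
    trainable = num_layers * adapters_per_layer * adapter_params(d_model, bottleneck)
    return trainable / (frozen + trainable)


if __name__ == "__main__":
    # Hypothetical sizes chosen for illustration only.
    share = trainable_fraction(num_layers=2, d_model=768, d_ff=3072, bottleneck=64)
    print(f"trainable share with frozen backbone: {share:.2%}")
```

Under these assumed sizes, the adapters account for only a few percent of the encoder's parameters, which is the intuition behind freezing the pre-trained backbone and tuning only the adapters in the target domain.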

Natural Language Processing-based Model for Log Anomaly Detection

LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs

HitAnomaly: Hierarchical Transformers for Anomaly Detection in System Log

Deep Learning-based Anomaly Detection and Log Analysis for Computer Networks

An Anomaly Detection Approach of Part-of-Speech Log Sequence Via Population Based Training

What Information Contributes to Log-based Anomaly Detection? Insights from a Configurable Transformer-Based Approach

OneLog: Towards End-to-End Training in Software Log Anomaly Detection

LogOnline: A Semi-Supervised Log-Based Anomaly Detector Aided with Online Learning Mechanism

Log Sequence Anomaly Detection Based on Local Information Extraction and Globally Sparse Transformer Model

LogPal: A Generic Anomaly Detection Scheme of Heterogeneous Logs for Network Systems

Log-based Anomaly Detection Without Log Parsing

Biglog: Unsupervised Large-scale Pre-training for a Unified Log Representation

LogBD: A Log Anomaly Detection Method Based on Pretrained Models and Domain Adaptation

OneLog: towards end-to-end software log anomaly detection

MetaLog: Generalizable Cross-System Anomaly Detection from Logs with Meta-Learning

Improving Log-Based Anomaly Detection by Pre-Training Hierarchical Transformers

LLMeLog: an Approach for Anomaly Detection Based on LLM-enriched Log Events

MLog: Mogrifier LSTM-based Log Anomaly Detection Approach Using Semantic Representation

LogPS: A Robust Log Sequential Anomaly Detection Approach Based on Natural Language Processing

LogAttn: Unsupervised Log Anomaly Detection with an AutoEncoder Based Attention Mechanism