Abstract:System logs are a valuable source of information for monitoring and maintaining the security and stability of computer systems. Techniques based on Deep Learning and Natural Language Processing have demonstrated effectiveness in detecting abnormal behaviour from these system logs. However, existing anomaly detection approaches have limitations in terms of flexibility and practicality. Techniques that rely on log templates such as DeepLog and LogBERT fail to capture semantic information and are unable to handle variability in log content. On the other hand, classification-based approaches such as LogSy, LogRobust and HitAnomaly require time-consuming data labelling for supervised training. In this paper, a novel log anomaly detection model named LogFiT is proposed. The LogFiT model doesn’t make use of a vocabulary of log templates and it doesn’t require any labeled data as the model only requires self-supervised training. The LogFiT model uses a pretrained Bidirectional Encoder Representations from Transformers (BERT)-based language model fine-tuned to recognise the linguistic patterns of the normal log data. The LogFiT model is trained using masked sentence prediction on the normal log data only. Consequently, when presented with the new log data, the model’s top- ${k}$ token prediction accuracy serves as a threshold for determining whether the new log data deviates from the normal log data. Experimental results show that LogFiT’s F1 score exceeds that of baselines on the HDFS, BGL, and Thunderbird datasets. Critically, when variability is introduced in the log data during evaluation, LogFiT retains its effectiveness compared to that of baselines.

BERT-Log: Anomaly Detection for System Logs Based on Pre-trained Language Model

Log Anomaly Detection method based on BERT model optimization

LogBERT: Log Anomaly Detection via BERT

BertHTLG: Graph-Based Microservice Anomaly Detection Through Sentence-Bert Enhancement.

Research on Log Anomaly Detection Based on Sentence-BERT

Leveraging Large Language Models and BERT for Log Parsing and Anomaly Detection

CLDTLog: System Log Anomaly Detection Method Based on Contrastive Learning and Dual Objective Tasks

SemLog: A Semantics-based Approach for Anomaly Detection in Big Data System Logs

LAnoBERT: System Log Anomaly Detection based on BERT Masked Language Model

LLMeLog: an Approach for Anomaly Detection Based on LLM-enriched Log Events

LogLLM: Log-based Anomaly Detection Using Large Language Models

Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models

LogFiT: Log Anomaly Detection Using Fine-Tuned Language Models

Natural Language Processing-based Model for Log Anomaly Detection

LogST: Log Semi-supervised Anomaly Detection Based on Sentence-BERT

An Anomaly Detection Approach of Part-of-Speech Log Sequence Via Population Based Training

Log-based Anomaly Detection Without Log Parsing

LogBD: A Log Anomaly Detection Method Based on Pretrained Models and Domain Adaptation

Robust Log-Based Anomaly Detection on Unstable Log Data

MLog: Mogrifier LSTM-based Log Anomaly Detection Approach Using Semantic Representation

Improving Log-Based Anomaly Detection by Pre-Training Hierarchical Transformers