Abstract: Log data is a valuable resource for understanding system status. Logs, which record the running status of a computer system, are commonly used to identify performance issues and malfunctions. Sequential anomaly detection on logs is crucial for building a secure and stable system and helps to discover, locate, and analyze system failures. In this paper, we propose a new log sequential anomaly detection method that is based on natural language processing techniques and trained with the Population Based Training (PBT) algorithm, and that makes full use of the semantic information in log templates to analyze log sequences. In the feature extraction stage, a Part-of-Speech (PoS) weight mechanism is first employed to improve the quality of the numerical representation of each log template. TextCNN is then used to extract noteworthy information from the log template vectors. In the sequential anomaly detection stage, the combination of TextCNN and an LSTM neural network improves detection accuracy. In addition, the proposed method uses the PBT algorithm to jointly train the parameters of the PoS weight mechanism and the parameters of the anomaly detection neural network, which accelerates model convergence and further improves detection accuracy. We evaluate the model on four data sets and compare it with two state-of-the-art models to demonstrate its effectiveness. The experimental results show that the proposed method performs well compared with other log anomaly detection methods.
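The PoS weight mechanism described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so everything here is an assumption for illustration only: the weighted-average form, the function name `template_vector`, the tag set, the toy embeddings, and the fixed weight values (which, in the proposed method, would be tuned by PBT together with the network parameters rather than set by hand).

```python
from typing import Dict, List, Tuple

# Hypothetical PoS weights. In the proposed method these would be learned
# jointly with the detection model via PBT; the values here are made up.
POS_WEIGHTS: Dict[str, float] = {"NOUN": 1.0, "VERB": 0.8, "OTHER": 0.1}


def template_vector(
    tokens: List[Tuple[str, str]],
    embeddings: Dict[str, List[float]],
    pos_weights: Dict[str, float],
) -> List[float]:
    """Return a PoS-weighted average of the word vectors in one log template.

    tokens is a list of (word, pos_tag) pairs. Words without an embedding
    are skipped; tags without a weight fall back to the "OTHER" weight.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word, tag in tokens:
        vec = embeddings.get(word)
        if vec is None:
            continue
        w = pos_weights.get(tag, pos_weights.get("OTHER", 0.0))
        for i in range(dim):
            total[i] += w * vec[i]
        weight_sum += w
    if weight_sum == 0.0:
        return total  # all-zero vector when nothing could be weighted
    return [x / weight_sum for x in total]


# Toy 2-dimensional embeddings, assumed for the example.
toy_embeddings = {"received": [0.0, 1.0], "block": [1.0, 0.0]}
toy_template = [("received", "VERB"), ("block", "NOUN")]
vector = template_vector(toy_template, toy_embeddings, POS_WEIGHTS)
```

In this sketch the noun "block" contributes more to the template vector than the verb "received" because its assumed weight is higher. A sequence of such vectors is what a downstream model such as the TextCNN and LSTM combination would consume.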

Detecting log anomaly using subword attention encoder and probabilistic feature selection

Natural Language Processing-based Model for Log Anomaly Detection

Research on System Log Anomaly Detection Combining Two-way Slice GRU and GA-Attention Mechanism

Log-based Anomaly Detection Without Log Parsing

LogPS: A Robust Log Sequential Anomaly Detection Approach Based on Natural Language Processing

An Anomaly Detection Approach of Part-of-Speech Log Sequence Via Population Based Training

LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs

DualAttlog: Context aware dual attention networks for log-based anomaly detection

Assessing the impact of bag‐of‐words versus word‐to‐vector embedding methods and dimension reduction on anomaly detection from log files

MLog: Mogrifier LSTM-based Log Anomaly Detection Approach Using Semantic Representation

ConAnomaly: Content-Based Anomaly Detection for System Logs

Robust Log-Based Anomaly Detection on Unstable Log Data

Network anomaly detection based on keyword embedding log

SemLog: A Semantics-based Approach for Anomaly Detection in Big Data System Logs

Distributed system anomaly detection using deep learning‐based log analysis

LogAttn: Unsupervised Log Anomaly Detection with an AutoEncoder Based Attention Mechanism

LLMeLog: an Approach for Anomaly Detection Based on LLM-enriched Log Events

Experience Report: Log Mining Using Natural Language Processing and Application to Anomaly Detection

End-To-End Anomaly Detection for Identifying Malicious Cyber Behavior through NLP-Based Log Embeddings

Recurrent Neural Network Language Models for Open Vocabulary Event-Level Cyber Anomaly Detection