SemLog: A Semantics-based Approach for Anomaly Detection in Big Data System Logs

Wu Chen, Ningning Han, Xiaoman Tan, Dongdong Wang, Siyang Lu
DOI: https://doi.org/10.1109/ICPADS60453.2023.00174
2023-12-17
Abstract: Syslog-based anomaly detection is crucial for protecting systems from malicious attacks and malfunctions. System logs are semi-structured text messages printed by logging statements to record a system’s run-time status, and they carry rich semantic information. However, the existing BERT-based log anomaly detection method operates on log key sequences: it does not consider the semantics of the log data and discards the variable part of each message, resulting in a high rate of missed detections. In this paper, we propose SemLog, a self-supervised framework for log anomaly detection based on BERT. By incorporating log semantics and variables and employing multi-feature fusion, we mitigate the independence assumption issue in the Masked Language Modeling (MLM) model. Experimental results on three benchmarks show that SemLog achieves high performance compared with state-of-the-art approaches for anomaly detection.
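To make the distinction between a log key and the variable part concrete, here is a minimal sketch in Python. It is not SemLog's parser (the abstract does not describe one). The function name `split_log_message`, the regular expression, and the sample message are all assumptions chosen for illustration. The sketch shows only what a key-sequence method keeps (the template) and what it throws away (the variables).

```python
import re

# Hypothetical pattern (an assumption, not from the paper): treat IPv4
# addresses with an optional port, hex literals, HDFS-style block ids,
# and plain integers as the variable part of a message.
VARIABLE_PATTERN = re.compile(
    r"\b(?:\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?|0x[0-9a-fA-F]+|blk_-?\d+|\d+)\b"
)


def split_log_message(message):
    """Return (log_key, variables) for one raw log message.

    log_key   : the message with every variable replaced by "<*>"
    variables : the matched variable tokens, in order of appearance
    """
    variables = VARIABLE_PATTERN.findall(message)
    log_key = VARIABLE_PATTERN.sub("<*>", message)
    return log_key, variables


# Illustrative message, made up for this example.
message = "Received block blk_123 of size 67108864 from 10.250.19.102"
log_key, variables = split_log_message(message)
print(log_key)
print(variables)
```

A method that models only the sequence of `log_key` values cannot tell two messages apart when they share a template but differ in, say, block size or source address. That discarded information is what the abstract refers to as the variable part.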
Computer Science