LogCSS: Log anomaly detection based on BERT-CNN with context-semantics-statistics features

Zhongliang Li, Xuezhen Tu, Hong Gao, Shiyue Huang, Zongmin Ma
DOI: https://doi.org/10.3233/jifs-235801
2024-02-07
Journal of Intelligent & Fuzzy Systems
Abstract: With the development of artificial intelligence, deep-learning-based log anomaly detection has become an important research topic. In this paper, we propose LogCSS, a novel log anomaly detection framework based on the Context-Semantics-Statistics Convolutional Neural Network (CSSCNN). It is the first model that uses BERT (Bidirectional Encoder Representations from Transformers) and a CNN (Convolutional Neural Network) to extract the semantic, temporal, and correlational features of logs. We combine these features with statistical information about log templates in the classification model to improve accuracy. We also propose a technique, DOOT (Deals with the Out-Of-Templates), for online template matching. Experiments show that our framework improves on the average F1 score of the six best algorithms in the industry by more than 5% on the open-source HDFS dataset and by more than 8% on the BGL dataset. LogCSS also outperforms other similar methods on our own constructed dataset.
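The abstract does not specify which template statistics LogCSS uses or how DOOT matches lines online. As a rough illustration only, the sketch below assumes the simplest form such statistics could take: a per-window count of each log template, with one extra bucket for lines that match no known template. The template strings, the `<*>` wildcard convention, and the function names are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical templates (not from the paper); "<*>" marks a variable field.
TEMPLATES = [
    "Receiving block <*> src: <*> dest: <*>",
    "Deleting block <*> file <*>",
    "PacketResponder <*> for block <*> terminating",
]


def template_to_regex(template):
    """Compile a template into a regex where each "<*>" matches one token."""
    parts = [re.escape(part) for part in template.split("<*>")]
    return re.compile(r"\S+".join(parts))


PATTERNS = [template_to_regex(t) for t in TEMPLATES]


def match_template(line):
    """Return the index of the first matching template, or None if none match."""
    for index, pattern in enumerate(PATTERNS):
        if pattern.fullmatch(line.strip()):
            return index
    return None


def count_features(window):
    """Count template occurrences in a window of log lines.

    The last slot counts out-of-template lines (an assumption made for
    this sketch, not the paper's DOOT technique).
    """
    counts = Counter(match_template(line) for line in window)
    features = [counts.get(i, 0) for i in range(len(TEMPLATES))]
    features.append(counts.get(None, 0))
    return features


window = [
    "Receiving block blk_1 src: /10.0.0.1:50010 dest: /10.0.0.2:50010",
    "PacketResponder 1 for block blk_1 terminating",
    "Deleting block blk_1 file /data/blk_1",
    "Receiving block blk_2 src: /10.0.0.3:50010 dest: /10.0.0.4:50010",
    "Unexpected error in disk scanner",
]
print(count_features(window))
```

A vector like this could be concatenated with learned semantic features before classification. Whether LogCSS does so in this form is not stated in the abstract.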