CausalConvLSTM: Semi-Supervised Log Anomaly Detection Through Sequence Modeling

Steven Yen, Melody Moh, Teng-Sheng Moh
DOI: https://doi.org/10.1109/icmla.2019.00217
2019-12-01
Abstract: Computer systems use logging to record events of interest. These logs are a rich source of information and can be analyzed to detect attacks, failures, and many other issues. Because logs are generated automatically by computer processes, their volume and throughput can be extremely large, which limits the effectiveness of manual analysis. Rule-based systems were introduced to detect issues automatically using rules written by experts. However, these systems can only detect known issues for which a matching rule exists in the rule set. Anomaly detection (AD) approaches, by contrast, can detect unknown issues by looking for behavior that differs significantly from the norm. In this paper, we target the problem of semi-supervised log anomaly detection, where the only training data available are normal logs from a baseline period. We propose a novel hybrid model called "CausalConvLSTM" for modeling log sequences. It takes advantage of the ability of Convolutional Neural Networks (CNNs) to efficiently extract spatial features in parallel, and of the superior ability of Long Short-Term Memory (LSTM) networks to capture sequential relationships. Another major challenge faced by anomaly detection systems is concept drift, which is the change in normal system behavior over time. We propose and evaluate concrete strategies for retraining neural-network (NN) anomaly detection systems to adapt to concept drift.
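The semi-supervised setting described in the abstract (train only on normal logs, then flag sequences that deviate) can be illustrated with a deliberately simple stand-in. The sketch below is not the CausalConvLSTM model: it swaps the neural sequence model for a lookup table of event transitions seen during the baseline period. The function names, the event labels, and the `context` window size are all hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def fit_transitions(normal_sequences, context=2):
    """Record, for each window of `context` events, which next events
    were observed in normal (baseline) log sequences."""
    allowed = defaultdict(set)
    for seq in normal_sequences:
        for i in range(context, len(seq)):
            allowed[tuple(seq[i - context:i])].add(seq[i])
    return allowed


def find_anomalies(seq, allowed, context=2):
    """Return the positions whose event was never seen after the
    preceding `context` events in the baseline data."""
    return [
        i
        for i in range(context, len(seq))
        if seq[i] not in allowed.get(tuple(seq[i - context:i]), set())
    ]


# Hypothetical baseline of normal log-event sequences (illustrative only).
normal = [
    ["open", "read", "write", "close"],
    ["open", "read", "read", "close"],
]
model = fit_transitions(normal)

suspect = ["open", "read", "delete", "close"]
flagged = find_anomalies(suspect, model)

# A crude illustration of adapting to concept drift: once the new
# behavior is accepted as normal, refit with it included in the baseline.
retrained = fit_transitions(normal + [suspect])
flagged_after_retrain = find_anomalies(suspect, retrained)
```

In this toy run, `flagged` marks the unseen `delete` event and the event that follows it, while `flagged_after_retrain` is empty because the refit baseline now covers that sequence. A neural model such as the one the paper proposes would replace the lookup table with learned predictions, and the paper's retraining strategies are more involved than a full refit.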