End-to-End AutoML for Unsupervised Log Anomaly Detection

Shenglin Zhang, Yuhe Ji, Jiaqi Luan, Xiaohui Nie, Zi'ang Chen, Minghua Ma, Yongqian Sun, Dan Pei
DOI: https://doi.org/10.1145/3691620.3695535
2024-01-01
Abstract: As modern software systems grow more complex, ensuring their reliable operation has become a critical challenge. Log data analysis is vital to maintaining system stability, and anomaly detection is a key part of it. However, existing log anomaly detection methods rely heavily on manual effort from experts and transfer poorly across systems. As a result, to perform anomaly detection on a new dataset, operators must understand the dataset in depth, make multiple attempts, and spend considerable time before they can deploy an algorithm that performs well. This paper proposes LogCraft, an end-to-end unsupervised log anomaly detection framework based on automated machine learning (AutoML). LogCraft automates feature engineering, model selection, and anomaly detection, reducing the need for specialized knowledge and lowering the barrier to algorithm deployment. Extensive evaluations on five public datasets demonstrate LogCraft's effectiveness: it achieves an average F1 score of 0.899, outperforming the second-best average F1 score of 0.847 obtained by existing unsupervised algorithms. To the best of our knowledge, LogCraft is the first attempt to extract fixed-dimensional vectors as latent representations from a complete log dataset. The proposed meta-feature extractor also shows promising potential for measuring log dataset similarity and guiding future log analytics research.