Multivariate Log-based Anomaly Detection for Distributed Database

Lingzhe Zhang, Tong Jia, Mengxi Jia, Ying Li, Yong Yang, Zhonghai Wu

DOI: https://doi.org/10.1145/3637528.3671725

2024-06-12

Abstract: Distributed databases are fundamental infrastructure for today's large-scale software systems, such as cloud systems. Detecting anomalies in distributed databases is essential for maintaining software availability. Existing approaches, predominantly developed using Loghub (a comprehensive collection of log datasets from various systems), lack datasets specifically tailored to distributed databases, which exhibit unique anomalies. Additionally, there is a notable absence of datasets encompassing multi-anomaly, multi-node logs. Consequently, models built upon these datasets, primarily designed for standalone systems, are inadequate for distributed databases, and the prevalent method of deeming an entire cluster anomalous based on irregularities in a single node leads to a high false-positive rate. This paper addresses the unique anomalies and multivariate nature of logs in distributed databases. We release the first open-source, comprehensive dataset with multivariate logs from distributed databases. Utilizing this dataset, we conduct an extensive study to identify multiple database anomalies and to assess the effectiveness of state-of-the-art anomaly detection using multivariate log data. Our findings reveal that relying solely on logs from a single node is insufficient for accurate anomaly detection in distributed databases. Leveraging these insights, we propose MultiLog, an innovative multivariate log-based anomaly detection approach tailored for distributed databases. Our experiments, based on this novel dataset, demonstrate MultiLog's superiority, outperforming existing state-of-the-art methods by approximately 12%.

Software Engineering

What problem does this paper attempt to address?

The paper aims to address the issue of anomaly detection in distributed databases, particularly focusing on the shortcomings of existing methods when dealing with distributed database logs. Specifically:

1. **Lack of log anomaly datasets specifically for distributed databases**: Existing log anomaly detection datasets (such as Loghub) mainly come from standalone systems or distributed file systems and are not specifically designed for distributed databases. Therefore, they cannot fully reflect the unique anomalies of distributed databases.
2. **Lack of datasets containing multi-type, multi-node logs**: Most existing datasets fail to cover multiple types of anomaly injections and usually only contain logs from a single source, which cannot reflect the interconnected nature of multi-node distributed databases.
3. **Limitations of existing models in applying to distributed databases**: Current models are mainly designed for standalone systems. When applied to distributed databases, they typically determine whether the entire cluster is abnormal through single-point classification, which can lead to a high false positive rate.

To address these issues, the authors constructed a new large-scale dataset and proposed a multivariate log anomaly detection method called MultiLog. MultiLog collects sequential information, quantitative information, and semantic information from each node, encodes them using an LSTM enhanced with a self-attention mechanism, and finally determines the state of the entire cluster through a cluster classifier that combines an AutoEncoder and a meta-classifier.

Experimental results show that MultiLog improves performance by approximately 12% in multi-node classification tasks and over 16% in single-node anomaly detection compared to existing methods.
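To make the per-node feature views and the cluster-level decision more concrete, here is a minimal Python sketch. It is not the authors' implementation: the event vocabulary `EVENT_IDS`, the function names, the node scores, and the 0.5 threshold are all hypothetical, the semantic view and the LSTM with self-attention are omitted, and a plain average of per-node scores stands in for the AutoEncoder plus meta-classifier. It only illustrates why flagging the cluster on a single node's score can raise a false alarm that a cluster-level decision may avoid.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical event-template vocabulary; in practice a log parser would produce it.
EVENT_IDS = ["E1", "E2", "E3", "E4"]


def sequential_feature(events: List[str]) -> List[int]:
    """Sequential view: the ordered template indices seen in one window."""
    index = {event_id: i for i, event_id in enumerate(EVENT_IDS)}
    return [index[e] for e in events if e in index]


def quantitative_feature(events: List[str]) -> List[int]:
    """Quantitative view: how often each template occurs in one window."""
    counts = Counter(events)
    return [counts.get(event_id, 0) for event_id in EVENT_IDS]


def any_node_rule(node_scores: Dict[str, float], threshold: float = 0.5) -> bool:
    """Single-point baseline: the cluster is anomalous if any node is."""
    return any(score >= threshold for score in node_scores.values())


def cluster_rule(node_scores: Dict[str, float], threshold: float = 0.5) -> bool:
    """Toy cluster-level decision: threshold the mean of all node scores.

    Assumption: this average is only a placeholder for a learned
    cluster classifier; it is not the method described in the paper.
    """
    mean_score = sum(node_scores.values()) / len(node_scores)
    return mean_score >= threshold


if __name__ == "__main__":
    window = ["E2", "E1", "E2"]
    print(sequential_feature(window))
    print(quantitative_feature(window))

    # Hypothetical per-node anomaly scores for one time window.
    scores = {"node1": 0.6, "node2": 0.1, "node3": 0.1}
    print(any_node_rule(scores), cluster_rule(scores))
```

In this toy input, one node slightly exceeds the threshold while the other two look healthy, so the single-point rule flags the whole cluster and the averaged rule does not. A learned cluster classifier would replace the average with a model trained on the combined per-node representations.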

Distributed system anomaly detection using deep learning‐based log analysis

Big-data Analysis of Multi-Source Logs for Anomaly Detection on Network-Based System

LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs

Distributed Online One-Class Support Vector Machine for Anomaly Detection over Networks

Development of Anomaly Detection System Based on Distributed Log Tracing

An Integrated Method for Anomaly Detection From Massive System Logs

An Approach for Anomaly Diagnosis Based on Hybrid Graph Model with Logs for Distributed Services

An Empirical Analysis of Anomaly Detection Methods for Multivariate Time Series

Robust Log-Based Anomaly Detection on Unstable Log Data

MoniLog: An Automated Log-Based Anomaly Detection System for Cloud Computing Infrastructures

Natural Language Processing-based Model for Log Anomaly Detection

Log-based Anomaly Detection of Enterprise Software: An Empirical Study

Log-based Anomaly Detection based on EVT Theory with feedback

Log‐based anomaly detection for distributed systems: State of the art, industry experience, and open issues

Log2graphs: An Unsupervised Framework for Log Anomaly Detection with Efficient Feature Extraction

Experience Report: Deep Learning-based System Log Analysis for Anomaly Detection

Anomaly Detection using Distributed Log Data: A Lightweight Federated Learning Approach

Combining K-Means and XGBoost Models for Anomaly Detection Using Log Datasets

End-to-End AutoML for Unsupervised Log Anomaly Detection

MDFULog: Multi-Feature Deep Fusion of Unstable Log Anomaly Detection Model