Abstract: Large numbers of key performance indicators (KPIs) are monitored as multivariate time series (MTS) data to ensure the reliability of software applications and service systems. Accurately detecting anomalies in MTS is critical for subsequent fault elimination. The scarcity of anomalies and of manual labels has led to the development of various self-supervised MTS anomaly detection (AD) methods, which optimize an overall objective/loss encompassing all metrics' regression objectives/losses. However, our empirical study uncovers the prevalence of conflicts among metrics' regression objectives, which forces MTS models to trade off competing losses. This critical aspect significantly impacts detection performance but has been overlooked in existing approaches. To address this problem, by mimicking the design of multi-gate mixture-of-experts (MMoE), we introduce CAD, a Conflict-aware multivariate KPI Anomaly Detection algorithm. CAD offers an exclusive structure for each metric to mitigate potential conflicts while fostering inter-metric promotions. Upon thorough investigation, we find that the poor performance of vanilla MMoE mainly comes from input-output misalignment in the MTS formulation and from convergence issues arising from expansive tasks. To address these challenges, we propose a straightforward yet effective task-oriented metric selection and p&s (personalized and shared) gating mechanism, which establishes CAD as the first practicable multi-task learning (MTL) based MTS AD model. Evaluations show that CAD obtains an average F1-score of 0.943 across three public datasets, notably outperforming state-of-the-art methods. Our code is accessible at https://github.com/dawnvince/MTS_CAD.
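As a rough illustration of the multi-gate mixture-of-experts idea that the abstract refers to, the sketch below shows how each task (here, each metric) can weight a shared pool of expert outputs through its own softmax gate. This is a generic MMoE-style mixing step, not the CAD implementation; the names `mmoe_mix`, `experts`, and `gates`, the metric names, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw gate scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mmoe_mix(expert_outputs, gate_scores):
    """Mix expert outputs for ONE task using that task's own gate.

    expert_outputs: list of expert vectors, all of the same length
    gate_scores:    one raw score per expert (hypothetical values here)
    """
    weights = softmax(gate_scores)
    dim = len(expert_outputs[0])
    return [
        sum(w * out[d] for w, out in zip(weights, expert_outputs))
        for d in range(dim)
    ]


# Two hypothetical experts shared by all metrics.
experts = [[1.0, 0.0], [0.0, 1.0]]

# Each metric (task) owns a separate gate, so metrics whose objectives
# conflict can lean on different experts instead of sharing everything.
gates = {"cpu": [2.0, 0.0], "mem": [0.0, 2.0]}

mixed = {name: mmoe_mix(experts, g) for name, g in gates.items()}
```

In this toy setting the `cpu` gate favors the first expert and the `mem` gate favors the second, which is the basic mechanism by which per-task gating can separate conflicting objectives while still sharing experts.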
