ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection

Yuhang Chen, Chaoyun Zhang, Minghua Ma, Yudong Liu, Ruomeng Ding, Bowen Li, Shilin He, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang
2023-11-14
Abstract: Anomaly detection in multivariate time series data is of paramount importance for ensuring the efficient operation of large-scale systems across diverse domains. However, accurately detecting anomalies in such data poses significant challenges. Existing approaches, including forecasting and reconstruction-based methods, struggle to address these challenges effectively. To overcome these limitations, we propose a novel anomaly detection framework named ImDiffusion, which combines time series imputation and diffusion models to achieve accurate and robust anomaly detection. The imputation-based approach employed by ImDiffusion leverages the information from neighboring values in the time series, enabling precise modeling of temporal and inter-correlated dependencies and reducing uncertainty in the data, thereby enhancing the robustness of the anomaly detection process. ImDiffusion further leverages diffusion models as time series imputers to accurately capture complex dependencies. We use the step-by-step denoised outputs generated during the inference process as valuable signals for anomaly prediction, resulting in improved accuracy and robustness of the detection process. We evaluate the performance of ImDiffusion via extensive experiments on benchmark datasets. The results demonstrate that our proposed framework significantly outperforms state-of-the-art approaches in terms of detection accuracy and timeliness. ImDiffusion has further been integrated into a real production system at Microsoft, where we observe a remarkable 11.4% increase in detection F1 score compared to the legacy approach. To the best of our knowledge, ImDiffusion represents a pioneering approach that combines imputation-based techniques with time series anomaly detection, while introducing the novel use of diffusion models to the field.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses anomaly detection in multivariate time-series (MTS) data. The authors point out that accurately detecting anomalies in such data is crucial for keeping large-scale systems running efficiently across many domains. Existing approaches, such as forecasting-based and reconstruction-based methods, struggle with this task because they cannot accurately model complex time-series data.

### Problem Background

1. **Complexity and Uncertainty**: Modern large-scale systems generate multi-dimensional, heterogeneous time-series data with complex spatio-temporal correlations, which makes the data harder to model.
2. **Limitations of Existing Methods**:
   - **Forecasting Methods**: Predict future values from historical data and use the prediction errors to detect anomalies. In complex real-world systems, the high uncertainty and variability of future values make accurate prediction difficult.
   - **Reconstruction Methods**: Encode the entire time series into an embedding space and infer anomaly labels from the reconstruction errors. These methods depend on the capacity of the reconstruction model and can struggle with heterogeneous, complex, and inter-correlated data.

### Solution

To overcome these limitations, the authors propose a new anomaly detection framework, ImDiffusion, which combines time-series imputation and diffusion models to achieve accurate and robust anomaly detection. Its main features are:

1. **Imputation Method**: Use neighboring values in the time series as additional conditional information to model temporal and inter-correlated dependencies more accurately. This reduces uncertainty in the data and makes detection more robust.
2. **Diffusion Model**: Use the step-by-step denoised outputs generated during inference as signals for anomaly prediction, which improves the accuracy and robustness of detection.
3. **Innovations**:
   - **Combination of Imputation and Diffusion Models**: Imputation captures complex correlations better, and the noising/denoising process models the time series stochastically.
   - **Ensembling**: Treat the step-by-step denoised outputs of the diffusion model as additional signals and combine them with an ensemble technique to further improve the accuracy and robustness of inference.

### Experimental Verification

The authors evaluate ImDiffusion through extensive experiments on multiple benchmark datasets. The results show that the framework significantly outperforms existing methods in both detection accuracy and timeliness. ImDiffusion has also been deployed in a production system at Microsoft, where it increased the detection F1 score by 11.4% compared to the legacy approach.

### Summary

By combining time-series imputation with diffusion models, ImDiffusion provides a novel and effective method for multivariate time-series anomaly detection and addresses the shortcomings of existing methods in modeling complex data.
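
The imputation-based scoring described above, and the idea of combining several imputation-error signals, can be illustrated with a small toy sketch. This is not the authors' implementation: linear interpolation stands in for the diffusion-based imputer, the ensemble members are produced by different mask block sizes rather than by intermediate denoising steps, and the function names, threshold, and vote count are all hypothetical choices made for illustration.

```python
import numpy as np


def impute_by_interpolation(x, observed):
    """Fill hidden time steps of x (shape T x K) by linear interpolation over time.

    Placeholder for a learned imputer (assumption: ImDiffusion uses a
    diffusion model here). `observed` is a boolean vector of length T
    marking the time steps the imputer is allowed to see.
    """
    t = np.arange(x.shape[0])
    filled = x.astype(float).copy()
    for k in range(x.shape[1]):
        filled[~observed, k] = np.interp(t[~observed], t[observed], x[observed, k])
    return filled


def imputation_error(x, block):
    """Per-time-step imputation error from two complementary block masks."""
    t = np.arange(x.shape[0])
    first = (t // block) % 2 == 0          # observe even-numbered blocks
    error = np.zeros(x.shape[0])
    for observed in (first, ~first):       # every step is hidden exactly once
        imputed = impute_by_interpolation(x, observed)
        hidden = ~observed
        # Sum absolute errors over channels for the hidden steps only.
        error[hidden] = np.abs(imputed[hidden] - x[hidden]).sum(axis=1)
    return error


def vote(member_errors, threshold, min_votes):
    """Flag a time step when at least `min_votes` members exceed `threshold`."""
    votes = (np.asarray(member_errors) > threshold).sum(axis=0)
    return votes >= min_votes


# Toy data: two smooth channels with a spike injected at time step 50.
t = np.arange(100)
x = np.stack([np.sin(0.1 * t), np.cos(0.1 * t)], axis=1)
x[50] += 5.0

# Three ensemble members (hypothetical choice: different mask block sizes).
member_errors = [imputation_error(x, block) for block in (1, 2, 3)]
flags = vote(member_errors, threshold=1.0, min_votes=3)
print(np.flatnonzero(flags))  # [50]
```

In this toy setup each individual member also produces large errors at steps next to the spike, because the interpolation reads the corrupted value when it is left observed. Requiring agreement across members suppresses those neighbors, which gives a rough intuition for why combining several signals can make detection more robust.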