Layer-constrained Variational Autoencoding Kernel Density Estimation Model for Anomaly Detection.

Peng Lv, Yanwei Yu, Yangyang Fan, Xianfeng Tang, Xiangrong Tong
DOI: https://doi.org/10.1016/j.knosys.2020.105753
IF: 8.139
2020-01-01
Knowledge-Based Systems
Abstract: Unsupervised techniques typically rely on the probability density distribution of the data to detect anomalies, where objects with low probability density are considered abnormal. However, modeling the density distribution of high-dimensional data is known to be hard, which makes detecting anomalies in high-dimensional data challenging. State-of-the-art methods address this problem by first applying dimension reduction techniques to the data and then detecting anomalies in the low-dimensional space. Unfortunately, the low-dimensional space does not necessarily preserve the density distribution of the original high-dimensional data, which jeopardizes the effectiveness of anomaly detection. In this work, we propose a novel high-dimensional anomaly detection method called LAKE. The key idea of LAKE is to unify the representation learning capacity of a layer-constrained variational autoencoder with the density estimation power of kernel density estimation (KDE). A probability density distribution of the high-dimensional data can then be learned, which effectively separates out the anomalies. LAKE consolidates the merits of the two worlds, namely the layer-constrained variational autoencoder and KDE, by using a probability density-aware strategy in the training process of the autoencoder. Extensive experiments on six public benchmark datasets demonstrate that our method significantly outperforms state-of-the-art methods in detecting anomalies and achieves up to 37% improvement in F1 score.
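The density-based scoring idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' LAKE implementation: the latent vectors below are synthetic stand-ins for what an encoder might produce, and the function names, the bandwidth, and the flagged fraction are all hypothetical choices made for illustration. The sketch only shows how a Gaussian KDE assigns low density to points far from the bulk of the data.

```python
import numpy as np

def gaussian_kde_log_density(reference, query, bandwidth=0.5):
    """Log density of each query point under a Gaussian KDE built on `reference`."""
    reference = np.asarray(reference, dtype=float)
    query = np.asarray(query, dtype=float)
    n, d = reference.shape
    # Squared Euclidean distance between every query and every reference point.
    diff = query[:, None, :] - reference[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    log_kernel = -0.5 * sq_dist / bandwidth ** 2
    # Log-sum-exp over the reference points for numerical stability.
    peak = log_kernel.max(axis=1, keepdims=True)
    log_sum = peak[:, 0] + np.log(np.exp(log_kernel - peak).sum(axis=1))
    # Normalisation: number of reference points and Gaussian kernel constant.
    log_norm = np.log(n) + 0.5 * d * np.log(2 * np.pi * bandwidth ** 2)
    return log_sum - log_norm

def flag_anomalies(log_density, fraction=0.05):
    """Flag the lowest-density `fraction` of points (hypothetical threshold rule)."""
    threshold = np.quantile(log_density, fraction)
    return log_density <= threshold

# Synthetic stand-in for low-dimensional representations (assumption, not real data).
rng = np.random.default_rng(0)
latent_normal = rng.normal(0.0, 1.0, size=(200, 2))
latent_outliers = rng.normal(6.0, 0.5, size=(5, 2))
latent = np.vstack([latent_normal, latent_outliers])

log_density = gaussian_kde_log_density(latent_normal, latent)
is_anomaly = flag_anomalies(log_density, fraction=0.05)
```

In this toy setting the five shifted points receive much lower log density than the bulk. The abstract's point is that such a separation only holds in practice when the learned low-dimensional space preserves the density structure of the original data, which is what the density-aware training strategy is meant to encourage.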