Latent Gaussian process for anomaly detection in categorical data

Fengmao Lv, Tao Liang, Jiayi Zhao, Zhongliu Zhuo, Jinzhao Wu, Guowu Yang
DOI: https://doi.org/10.1016/j.knosys.2021.106896
IF: 8.139
2021-01-01
Knowledge-Based Systems
Abstract: We propose a semi-supervised approach to anomaly detection in multivariate categorical data. Our goal is to learn a model that can identify anomalous data given only a small set of training data from the normal class. To this end, our approach learns the probability distribution of normal instances under the assumption that the categorical data are generated from a continuous latent space. A Gaussian process is adopted to construct the generative model. As a non-parametric Bayesian model, a Gaussian process can adapt its model complexity to the data size. Hence, our approach can be effective when the training dataset is small. Comprehensive experiments on different benchmarks demonstrate the effectiveness of our approach. (c) 2021 Elsevier B.V. All rights reserved.
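The semi-supervised protocol described in the abstract (fit a probability model on normal instances only, then flag instances that the model finds unlikely) can be illustrated with a much simpler stand-in. The sketch below is not the paper's latent Gaussian process model: it assumes independent features with Laplace-smoothed category frequencies and has no latent space, so it cannot capture dependencies between features. The function names, the smoothing parameter `alpha`, and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_categorical_baseline(X_normal, n_categories, alpha=1.0):
    """Estimate smoothed per-feature category probabilities from normal data only.

    Assumption: features are independent (unlike the paper's latent-space model).
    """
    probs = []
    for j, k in enumerate(n_categories):
        counts = np.bincount(X_normal[:, j], minlength=k).astype(float)
        # Laplace smoothing keeps unseen categories at a small nonzero probability.
        probs.append((counts + alpha) / (counts.sum() + alpha * k))
    return probs

def anomaly_score(X, probs):
    """Negative log-likelihood under the fitted model; higher means more anomalous."""
    X = np.atleast_2d(X)
    score = np.zeros(X.shape[0])
    for j, p in enumerate(probs):
        score -= np.log(p[X[:, j]])
    return score

# Hypothetical toy data: 3 categorical features, each with 3 integer-coded categories.
X_normal = np.array([
    [0, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
])
probs = fit_categorical_baseline(X_normal, n_categories=[3, 3, 3])

# The first test instance resembles the training data; the second uses unseen categories.
scores = anomaly_score(np.array([[0, 1, 0], [2, 2, 2]]), probs)
print(scores)
```

In this toy run the second instance should receive the higher score, since every one of its categories is absent from the normal training set. A threshold on the score would then separate normal from anomalous instances; how to choose that threshold is not specified in the abstract.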