On Equivalence of Anomaly Detection Algorithms

Carlos Ivan Jerez, Jun Zhang, Marcia R. Silva
DOI: https://doi.org/10.1145/3536428
IF: 4.157
2023-01-01
ACM Transactions on Knowledge Discovery from Data
Abstract: In most domains, anomaly detection is typically cast as an unsupervised learning problem because of the infeasibility of labeling large datasets. In this setup, the evaluation and comparison of different anomaly detection algorithms are difficult. Although some work has been published in this field, it fails to account for the fact that different algorithms can detect different kinds of anomalies. More precisely, the literature on this topic has focused on defining criteria to determine which algorithm is better, while ignoring the fact that such criteria are meaningful only if the algorithms being compared detect the same kind of anomalies. Therefore, in this article, we propose an equivalence criterion for anomaly detection algorithms that measures to what degree two anomaly detection algorithms detect the same kind of anomalies. First, we lay out a set of desirable properties that such an equivalence criterion should have and explain why; second, we propose the Gaussian Equivalence Criterion (GEC) as an equivalence criterion and show mathematically that it has these desirable properties. Finally, we empirically validate these properties using a simulated dataset and a real-world dataset. For the real-world dataset, we show how GEC can provide insight into the anomaly detection algorithms as well as the dataset.