Abstract: In this paper, we address the problem of unsupervised video anomaly detection (UVAD). The task aims to detect abnormal events in test videos using unlabeled videos as training data. The presence of anomalies in the training data poses a significant challenge in this task, particularly because they form clusters in the feature space. We refer to this property as the "Anomaly Cluster" issue. The condensed nature of these anomalies makes it difficult to distinguish between normal and abnormal data in the training set. Consequently, training conventional anomaly detection techniques using an unlabeled dataset often leads to sub-optimal results. To tackle this difficulty, we propose a new method called Cleansed k-Nearest Neighbor (CKNN), which explicitly filters out the Anomaly Clusters by cleansing the training dataset. Following the k-nearest neighbor algorithm in the feature space provides powerful anomaly detection capability. Although the identified Anomaly Cluster issue presents a significant challenge to applying k-nearest neighbor in UVAD, our proposed cleansing scheme effectively addresses this problem. We evaluate the proposed method on various benchmark datasets and demonstrate that CKNN outperforms the previous state-of-the-art UVAD method by up to 8.5% (from 82.0 to 89.0) in terms of AUROC. Moreover, we emphasize that the performance of the proposed method is comparable to that of the state-of-the-art method trained using anomaly-free data.

What problem does this paper attempt to address?

The paper addresses unsupervised video anomaly detection (UVAD): detecting anomalous events in test videos when only unlabeled videos are available for training. Its core contribution is a new method, Cleansed k-Nearest Neighbor (CKNN), that tackles the "Anomaly Cluster" problem in UVAD. Because the unlabeled training data may itself contain anomalous events, those events form clusters in the feature space. This makes normal and anomalous training data hard to tell apart and degrades conventional anomaly detection techniques.

Specifically, the paper proceeds in the following steps:

1. **Identifying the "Anomaly Cluster" problem**: The paper first defines and points out the phenomenon of anomalous data forming dense clusters in video anomaly detection, which makes it difficult to distinguish from normal data in the feature space.
2. **Proposing a cleansing scheme**: To solve this problem, CKNN adopts a method called "object cleansing," which evaluates a pseudo-anomaly score for each object and removes the objects with higher scores, thereby reducing the Anomaly Clusters in the training data.
3. **Adopting the k-nearest neighbor algorithm**: The k-nearest neighbor algorithm is applied to the cleansed dataset to detect anomalous events in the test data.
4. **Experimental validation**: Evaluation on multiple benchmark datasets shows that CKNN outperforms existing methods, especially when the training set contains anomalous data, where it significantly surpasses traditional reconstruction-loss methods.

In summary, the main goal of the paper is to improve the accuracy of unsupervised video anomaly detection with a cleansed k-nearest neighbor algorithm, particularly when the training data contains dense clusters of anomalous events.
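The cleansing and k-nearest-neighbor steps described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `cleanse` and `knn_scores`, the `remove_ratio` parameter, the toy 2-D features, and the hand-written `pseudo_scores` are all assumptions made for illustration, and how the paper actually computes pseudo-anomaly scores is not covered here.

```python
import numpy as np

def cleanse(train_feats, pseudo_scores, remove_ratio=0.3):
    """Drop the training features with the highest pseudo-anomaly scores.

    remove_ratio is a hypothetical hyperparameter for this sketch.
    """
    n_remove = int(round(remove_ratio * len(train_feats)))
    n_keep = len(train_feats) - n_remove
    keep_idx = np.argsort(pseudo_scores)[:n_keep]  # lowest scores first
    return train_feats[keep_idx]

def knn_scores(test_feats, ref_feats, k=1):
    """Anomaly score: mean Euclidean distance to the k nearest reference features."""
    diffs = test_feats[:, None, :] - ref_feats[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)        # shape (n_test, n_ref)
    nearest = np.sort(dists, axis=1)[:, :k]      # k smallest distances per test point
    return nearest.mean(axis=1)

# Toy training set: 7 normal features near the origin plus a tight
# cluster of 3 anomalous features near (5, 5).
normal = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [0.2, 0.2],
                   [0.1, 0.1], [-0.1, 0.0], [0.0, -0.1]])
anomalous = np.array([[5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
train = np.vstack([normal, anomalous])

# Hypothetical pseudo-anomaly scores, supplied by hand for this example.
pseudo_scores = np.array([0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 0.1, 0.9, 0.8, 0.95])

# One normal-looking and one anomalous-looking test feature.
test = np.array([[0.1, 0.0], [5.0, 5.05]])

raw = knn_scores(test, train, k=1)        # kNN on the uncleansed training set
cleansed = cleanse(train, pseudo_scores, remove_ratio=0.3)
clean = knn_scores(test, cleansed, k=1)   # kNN after cleansing

print("uncleansed:", raw)
print("cleansed:  ", clean)
```

In this toy setup, the anomalous test feature sits next to the anomaly cluster in the uncleansed training set, so plain k-nearest neighbor gives it a lower score than the normal test feature. Once the highest-scoring training features are removed, its nearest neighbors are far away and its score becomes large.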

CKNN: Cleansed k-Nearest Neighbor for Unsupervised Video Anomaly Detection