Unsupervised Dictionary Learning for Anomaly Detection

Paul Irofti, Andra Băltoiu
DOI: https://doi.org/10.48550/arXiv.2003.00293
2020-09-07
Abstract: We investigate the possibilities of employing dictionary learning to address the requirements of most anomaly detection applications, such as absence of supervision, online formulations, low false positive rates. We present new results of our recent semi-supervised online algorithm, TODDLeR, on an anti-money laundering application. We also introduce a novel unsupervised method of using the performance of the learning algorithm as indication of the nature of the samples.
Machine Learning, Cryptography and Security, Computer Vision and Pattern Recognition, Numerical Analysis
What problem does this paper attempt to address?
This paper addresses several key problems in anomaly detection, in particular how to identify anomalies effectively when supervision is limited or entirely absent. Specifically, it targets the following problems:

1. **Lack of supervision information**: In many anomaly detection applications, not all anomaly types are known in advance, so the system must be able to identify new, unseen anomaly types during operation. Traditional supervised methods perform poorly in this case.
2. **Requirement for online processing**: Anomaly detection usually involves large amounts of data, of which only a few samples are anomalies. Classical dictionary learning (DL) methods struggle with such large-scale data sets, so methods that can process data online are needed.
3. **Low false positive rate**: In practical applications such as anti-money laundering and fraud detection, the false positive rate must be as low as possible to ensure the reliability of the system.

To solve these problems, the paper proposes two anomaly detection methods based on the dictionary learning framework:

- **Semi-supervised online algorithm (TODDLeR)**: This method learns from and classifies newly arrived signals online, given a small amount of labeled data. It extends the classical dictionary learning objective function to incorporate a classifier and a label-consistency dictionary, thereby improving the ability to identify new anomaly types.
Its optimization objective can be expressed as:
\[ \min_{D,W,A} \|y - Dx\|_2^2 + \alpha \|h - Wx\|_2^2 + \beta \|q - Ax\|_2^2 + \lambda_1 \|W - W_0\|_F^2 + \lambda_2 \|A - A_0\|_F^2 \]
where \(y\) is the newly arrived signal, \(D\) is the dictionary, \(x\) is the sparse representation of \(y\) in \(D\), \(W\) and \(A\) are the linear classifier and the label-consistency dictionary respectively, \(h\) and \(q\) are the estimated label and atom-assignment matrices, \(\alpha\) and \(\beta\) weight the classification and label-consistency terms, and \(\lambda_1\) and \(\lambda_2\) are regularization parameters that penalize deviation of \(W\) and \(A\) from \(W_0\) and \(A_0\).
- **Unsupervised method**: This method applies when no labels are available at all. It infers the nature of the samples from performance indicators of the dictionary learning process. Specifically, it gradually filters out the signals that are less likely to be anomalies until only the set of potential anomalies remains. For example, normal and anomalous samples can be distinguished by their representation error or by atom popularity.

These methods have been tested in practical application scenarios such as financial fraud detection, showing good performance and a low false positive rate.
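To make the TODDLeR objective above more concrete, the following minimal sketch evaluates its five terms for a single signal and also computes the representation error that the unsupervised method can use as an indicator. It is only an illustration, not the authors' implementation: the function names are hypothetical, \(h\) and \(q\) are treated as vectors for one signal, \(W_0\) and \(A_0\) are assumed to be earlier estimates of \(W\) and \(A\), and plain Python lists are used to stay dependency-free (NumPy would be the natural choice in practice). The sketch evaluates the objective only; it does not perform the minimization or the sparse coding step.

```python
import math


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((a_i - b_i) ** 2 for a_i, b_i in zip(a, b))


def sq_frobenius_dist(M, N):
    """Squared Frobenius norm of the difference of two matrices."""
    return sum(sq_dist(row_m, row_n) for row_m, row_n in zip(M, N))


def toddler_objective(y, x, h, q, D, W, A, W0, A0, alpha, beta, lam1, lam2):
    """Value of the five-term objective for one signal y with sparse code x.

    Assumption: W0 and A0 are earlier estimates of W and A.
    """
    representation = sq_dist(y, matvec(D, x))            # ||y - Dx||_2^2
    classification = alpha * sq_dist(h, matvec(W, x))    # alpha ||h - Wx||_2^2
    consistency = beta * sq_dist(q, matvec(A, x))        # beta ||q - Ax||_2^2
    change_w = lam1 * sq_frobenius_dist(W, W0)           # lambda_1 ||W - W0||_F^2
    change_a = lam2 * sq_frobenius_dist(A, A0)           # lambda_2 ||A - A0||_F^2
    return representation + classification + consistency + change_w + change_a


def representation_error(y, D, x):
    """Representation error ||y - Dx||_2 of one signal."""
    return math.sqrt(sq_dist(y, matvec(D, x)))


# Toy usage with made-up values (two atoms, one class).
D = [[1.0, 0.0], [0.0, 1.0]]
value = toddler_objective(
    y=[1.0, 1.0], x=[1.0, 0.0], h=[0.0], q=[1.0, 0.0],
    D=D, W=[[1.0, 0.0]], A=D, W0=[[0.0, 0.0]], A0=D,
    alpha=0.5, beta=0.5, lam1=0.1, lam2=0.1,
)
print(value, representation_error([1.0, 1.0], D, [1.0, 0.0]))
```

A signal that the dictionary represents poorly yields a large `representation_error`, which is one way such a score could separate candidate anomalies from normal samples. How the scores are thresholded or combined with atom popularity is not specified in the summary above and is left open here.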