Anomaly Detection using One-Class Neural Networks

Raghavendra Chalapathy, Aditya Krishna Menon, Sanjay Chawla
DOI: https://doi.org/10.48550/arXiv.1802.06360
2019-01-11
Abstract: We propose a one-class neural network (OC-NN) model to detect anomalies in complex data sets. OC-NN combines the ability of deep networks to extract a progressively rich representation of data with the one-class objective of creating a tight envelope around normal data. The OC-NN approach breaks new ground for the following crucial reason: data representation in the hidden layer is driven by the OC-NN objective and is thus customized for anomaly detection. This is a departure from other approaches which use a hybrid approach of learning deep features using an autoencoder and then feeding the features into a separate anomaly detection method like one-class SVM (OC-SVM). The hybrid OC-SVM approach is sub-optimal because it is unable to influence representational learning in the hidden layers. A comprehensive set of experiments demonstrate that on complex data sets (like CIFAR and GTSRB), OC-NN performs on par with state-of-the-art methods and outperformed conventional shallow methods in some scenarios.
Machine Learning, Neural and Evolutionary Computing
What problem does this paper attempt to address?
The problem this paper addresses is **detecting outliers in complex datasets**. Specifically, the authors propose a one-class classification model based on neural networks, the One-Class Neural Network (OC-NN), to improve outlier detection on complex, high-dimensional datasets. Traditional methods such as the one-class support vector machine (OC-SVM) perform poorly on such data, while existing deep-learning methods usually adopt hybrid models: an autoencoder first extracts features, which are then fed into a separate outlier detection algorithm. Because the detector in a hybrid model cannot influence what the hidden layers learn, the learned representation is not optimized for outlier detection.

### Main problems and challenges

1. **Outlier detection in complex datasets**:
   - Existing shallow methods (such as OC-SVM) perform poorly on complex, high-dimensional data.
   - Although deep-learning methods are effective, existing hybrid models fail to fully use the representational power of deep networks.
2. **Combination of representation learning and outlier detection**:
   - Existing methods do not couple representation learning to the outlier detection objective, which results in sub-optimal feature extraction.
3. **Efficient and accurate outlier detection**:
   - Outlier detection on complex datasets must be accurate while keeping training and testing times reasonable.

### Solutions

The OC-NN model proposed by the authors addresses these problems in the following ways:

- **Combining deep representation learning with a one-class objective**: OC-NN uses the ability of deep networks to extract rich data representations and trains the hidden layers with a one-class objective similar to the objective function of OC-SVM.
- **Customized representation learning**: The representation learned by OC-NN is driven by its one-class objective, so it is tailored to the outlier detection task.
- **Experimental verification**: Experiments on complex datasets (such as CIFAR-10 and GTSRB) show that OC-NN performs on par with state-of-the-art methods and outperforms conventional shallow methods in some scenarios.

### Formula representation

The objective function of OC-NN can be expressed as:

\[ \min_{w,V,r} \frac{1}{2}\|w\|^2_2 + \frac{1}{2}\|V\|^2_F + \frac{1}{\nu} \cdot \frac{1}{N} \sum_{n = 1}^N \max(0, r - \langle w, g(VX_n)\rangle) - r \]

where:

- \(w\) is the weight vector from the hidden layer to the output layer.
- \(V\) is the weight matrix from the input layer to the hidden layer.
- \(r\) is the bias term of the decision hyperplane.
- \(g(\cdot)\) is the activation function (for example, linear or sigmoid).
- \(X_n\) is the \(n\)-th input data point, and \(N\) is the number of training points.
- \(\nu \in (0, 1)\) controls the trade-off between the distance of the hyperplane from the origin and the number of points allowed to fall on the wrong side of it, as in OC-SVM.

Because \(w\) and \(V\) are optimized jointly under this objective, the hidden-layer representation is shaped directly by the one-class criterion rather than learned separately from it.
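
The objective above can be evaluated directly for a single hidden layer. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names `ocnn_objective` and `ocnn_decision_scores`, the array shapes, and the convention of flagging a point as anomalous when its score falls below `r` (borrowed from OC-SVM) are all illustrative choices that do not come from the text.

```python
import numpy as np


def sigmoid(z):
    """Sigmoid activation, one of the choices mentioned for g."""
    return 1.0 / (1.0 + np.exp(-z))


def linear(z):
    """Linear (identity) activation, the other choice mentioned for g."""
    return z


def ocnn_objective(w, V, r, X, nu, g=sigmoid):
    """Evaluate the OC-NN objective for fixed parameters.

    Assumed shapes (hypothetical, for illustration):
      X: (N, d) inputs, V: (h, d) input-to-hidden weights,
      w: (h,) hidden-to-output weights, r: scalar bias, 0 < nu < 1.
    """
    hidden = g(X @ V.T)                   # g(V X_n) for every n, shape (N, h)
    scores = hidden @ w                   # <w, g(V X_n)>, shape (N,)
    hinge = np.maximum(0.0, r - scores)   # max(0, r - <w, g(V X_n)>)
    return (
        0.5 * np.dot(w, w)                # (1/2) ||w||_2^2
        + 0.5 * np.sum(V ** 2)            # (1/2) ||V||_F^2
        + hinge.mean() / nu               # (1/nu) * (1/N) * sum of hinge terms
        - r
    )


def ocnn_decision_scores(w, V, r, X, g=sigmoid):
    """Score each point as <w, g(V X_n)> - r.

    Assumption: negative scores are treated as anomalous, as in OC-SVM.
    """
    return g(X @ V.T) @ w - r


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 8))         # synthetic data, illustration only
    V = rng.normal(scale=0.1, size=(4, 8))
    w = rng.normal(scale=0.1, size=4)
    print("objective:", ocnn_objective(w, V, r=0.1, X=X, nu=0.1))
    print("flagged:", int((ocnn_decision_scores(w, V, 0.1, X) < 0).sum()))
```

In practice `w`, `V`, and `r` would be fitted by minimizing this quantity with a gradient-based optimizer; the sketch only evaluates the objective for given parameters and does not perform training.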