GradStop: Exploring Training Dynamics in Unsupervised Outlier Detection through Gradient Cohesion

Yuang Zhang, Liping Wang, Yihong Huang, Yuanxing Zheng
2024-12-12
Abstract: Unsupervised Outlier Detection (UOD) is a critical task in data mining and machine learning, aiming to identify instances that significantly deviate from the majority. Without any labels, deep UOD methods struggle with the misalignment between the model's direct optimization goal and the final performance goal of the Outlier Detection (OD) task. From the perspective of training dynamics, this paper proposes an early stopping algorithm to optimize the training of deep UOD models, ensuring they perform optimally in OD rather than overfitting the entire contaminated dataset. Inspired by the UOD mechanism and the inlier priority phenomenon, where models intuitively fit inliers more quickly than outliers, we propose GradStop, a sampling-based, label-free algorithm to estimate the model's real-time performance during training. First, a sampling method generates two sets, one likely containing more outliers and the other more inliers. Then a metric based on gradient cohesion is applied to probe the current training dynamics, which reflect the model's performance on the OD task. Experimental results on 4 deep UOD algorithms and 47 real-world datasets, together with theoretical proofs, demonstrate the effectiveness of the proposed early stopping algorithm in enhancing the performance of deep UOD models. An Auto Encoder (AE) enhanced by GradStop achieves better performance than the plain AE, other SOTA UOD methods, and even ensemble AEs. Our method provides a robust and effective solution to the problem of performance degradation during training, enabling deep UOD models to better realize their potential in anomaly detection tasks.
Machine Learning
What problem does this paper attempt to address?
This paper aims to solve the performance degradation problem in Unsupervised Outlier Detection (UOD). Specifically, UOD models face the following challenges during training:

1. **Inconsistency between the optimization objective and the final performance objective**: The direct optimization objective of a UOD model (such as minimizing the reconstruction loss) differs from the final outlier detection performance objective (such as AUC or AP). This inconsistency may cause the model to overfit the entire contaminated dataset in the later stages of training, thereby reducing outlier detection performance.
2. **Completely unsupervised training setting**: Since UOD is unsupervised, no label information is available to verify the model's real-time performance. An effective label-free evaluation metric is therefore required to infer the model's outlier detection performance.
3. **Complex and diverse training dynamics**: Training dynamics vary greatly across combinations of algorithm and dataset, which leads to various performance patterns, such as continuous improvement or decline, a decline followed by a rise, or continuous fluctuation. A robust algorithm is therefore required to handle these dynamics.

To solve these problems, the authors propose the GradStop algorithm, which analyzes training dynamics to stop the training of a deep UOD model at the point where it performs best in outlier detection, before it overfits the entire contaminated dataset. The method has three parts:

- **Sampling method**: Generate two sets, one likely to contain more outliers and the other likely to contain more inliers.
- **Gradient cohesion metric**: Apply a metric based on gradient cohesion to the two sets to probe the current training dynamics, which reflect the model's performance on the outlier detection task.
- **Automated early-stopping algorithm**: Decide automatically at each epoch, according to the above metric, whether to stop training.

Experimental results show that GradStop can effectively prevent performance degradation and significantly improve the detection performance of AE and other deep UOD models.

### Formula summary

- **Loss function calculation**:
  \[ L(M;B)=\frac{1}{|B|}\sum_{x\in B}J_M(x)=\frac{1}{|B|}\sum_{x\in B}f_M(x)=\frac{1}{|B|}\sum_{i}v_i \]
  where \(J_M(\cdot)\) is the unsupervised loss function of model \(M\), \(f_M(x)\) is the outlier score the model assigns to \(x\), \(v_i\) is the score of the \(i\)-th sample in batch \(B\), and \(L\) is the loss value used to update the model parameters.
- **Objective function**:
  \[ P(v^- < v^+) = P(f_M(x_{\text{in}})<f_M(x_{\text{out}})|x_{\text{in}}\sim X_{\text{in}},x_{\text{out}}\sim X_{\text{out}}) \]
  The objective is to maximize this probability, so that data points from the inlier distribution \(X_{\text{in}}\) receive lower outlier scores and data points from the outlier distribution \(X_{\text{out}}\) receive higher ones.
- **AUC calculation**:
  \[ AUC(M,D)=\frac{1}{|D_{\text{in}}||D_{\text{out}}|}\sum_{x_i\in D_{\text{in}}}\sum_{x_j\in D_{\text{out}}}I(f_M(x_i)<f_M(x_j)) \]
  where \(I\) is an indicator function that returns 1 when the condition is met and 0 otherwise.

Through these formulas and methods, GradStop can effectively monitor and optimize the training process of deep UOD models and improve their performance in outlier detection tasks.
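The loss and AUC formulas above can be made concrete with a short, hedged Python sketch. It is not the authors' code: the function names `batch_loss` and `pairwise_auc` and the example scores are hypothetical, chosen only to illustrate the definitions. The sketch assumes that each sample already has an outlier score \(f_M(x)\) (for example, an AE reconstruction error) and that ground-truth inlier/outlier labels are available for evaluation only, never for training.

```python
from typing import Sequence


def batch_loss(scores: Sequence[float]) -> float:
    """L(M; B): mean of the per-sample scores v_i = f_M(x_i) over a batch."""
    if not scores:
        raise ValueError("batch must be non-empty")
    return sum(scores) / len(scores)


def pairwise_auc(
    inlier_scores: Sequence[float], outlier_scores: Sequence[float]
) -> float:
    """AUC(M, D): fraction of (inlier, outlier) pairs with f_M(x_in) < f_M(x_out).

    Ties count as 0, matching the strict inequality in the formula.
    """
    if not inlier_scores or not outlier_scores:
        raise ValueError("both score lists must be non-empty")
    correct = sum(
        1
        for v_in in inlier_scores
        for v_out in outlier_scores
        if v_in < v_out
    )
    return correct / (len(inlier_scores) * len(outlier_scores))


# Hypothetical outlier scores for illustration; not taken from the paper.
inlier_scores = [0.10, 0.20, 0.35, 0.50]
outlier_scores = [0.30, 0.90]

auc = pairwise_auc(inlier_scores, outlier_scores)   # 6 of 8 pairs are ordered correctly
loss = batch_loss(inlier_scores + outlier_scores)   # mean score over the mixed batch
```

The example shows why the two quantities can diverge: the loss averages scores over the whole contaminated batch, while the AUC depends only on how inlier scores rank against outlier scores. Driving the loss down on every sample, outliers included, does not by itself improve that ranking.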