A Semi-supervised Approach for Industrial Anomaly Detection via Self-adaptive Clustering
Xiaoxue Ma,Jacky Keung,Pinjia He,Yan Xiao,Xiao Yu,Yishu Li
DOI: https://doi.org/10.1109/tii.2023.3280246
IF: 12.3
2023-01-01
IEEE Transactions on Industrial Informatics
Abstract:With the rapid development of the Industrial Internet of Things (IIoT), log-based anomaly detection has become vital for smart industrial construction that has prompted many researchers to contribute. To detect anomalies based on log data, semi-supervised approaches stand out from supervised and unsupervised approaches because they only require a portion of labeled data and are relatively stable. However, the state-of-the-art semi-supervised approaches still suffer from two main problems: manual parameter setting and unsatisfactory performance with high false positives. We propose AdaLog, an integrated semi-supervised approach based on self-adaptive clustering, for industrial anomaly detection. In particular, the clustering step performs automatic label probability estimation by distinguishing twelve situations so that the label probability of each unlabeled data can be carefully calculated, leading to high accuracy. In addition, AdaLog employs a pre-trained model to learn contextual information comprehensively and a transformer-based model to detect anomalies efficiently. To alleviate class imbalance, an undersampling method is incorporated. The results on three popular datasets demonstrate that AdaLog significantly outperforms three state-of-the-art semi-supervised approaches by 17.8%–2,489.8% on average in terms of F1-score, and is even superior to two supervised approaches in most cases with average improvements of 10.9%–23.8%.
automation & control systems,computer science, interdisciplinary applications,engineering, industrial