Abstract: As the digital world increasingly complements the physical world, establishing a solid line of defense against cyber attacks becomes both critical and arduous. Intrusion detection systems (IDSs) based on supervised learning have achieved excellent performance, but they require a large amount of labeled data in the training phase. However, attacks occur much less frequently than normal behavior, and accurate labels are difficult to obtain. In addition, IDSs based on supervised learning cannot identify unknown attacks. Meanwhile, traditional unsupervised learning methods suffer from detection accuracy that varies greatly across applications. It is therefore necessary to perform high-precision anomaly detection on unlabeled samples. This paper proposes a traffic anomaly detection model using K-means and the Active Learning Method (ALM), composed mainly of a feature extraction module and an anomaly detection module. First, the feature extraction module uses the Pearson correlation coefficient and Light Gradient Boosting Machine (LightGBM) to select important features. Second, K-means divides the feature-processed traffic into normal and abnormal categories. Finally, the K-means results are diffused through ALM, and new classification results are obtained after defuzzification, thereby improving the accuracy of anomaly detection. The experiments use the latest CICDDoS2019 data set. Experimental results show that the detection accuracy of the proposed model is above 90% and the F1 score is above 95%, for both binary classification of a single attack and mixed classification of multiple attacks. Compared with three unsupervised learning methods (K-means, Auto-encoder, and long short-term memory (LSTM)) and three supervised learning methods (Naive Bayes (NB), Support Vector Machine (SVM), and Decision Tree (DT)), the proposed model has higher classification accuracy and better generalization. This work supports exploring the application of unsupervised learning methods in network intrusion detection systems based on the characteristics of the data itself.
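
The first two stages described in the abstract (correlation-based feature filtering, then K-means into two groups) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the LightGBM importance ranking and the ALM diffusion and defuzzification steps are omitted, the synthetic flows and feature names are hypothetical, the 0.95 correlation threshold and the farthest-point initialization are illustrative choices, and treating the smaller cluster as abnormal is an assumption that follows the abstract's remark that attacks are rarer than normal behavior.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (sx * sy)


def select_features(rows, threshold=0.95):
    """Keep a column only if |r| with every already-kept column is below threshold."""
    cols = list(zip(*rows))
    kept = []
    for j, col in enumerate(cols):
        if all(abs(pearson(col, cols[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def two_means(points, iters=100):
    """Lloyd's algorithm with k=2 and a deterministic farthest-point start."""
    first = points[0]
    second = max(points, key=lambda p: math.dist(p, first))
    centers = [first, second]
    labels = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
                  for p in points]
        # Update step: each center moves to the mean of its members.
        updated = []
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(tuple(sum(v) / len(members) for v in zip(*members))
                           if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels


def flow(rate, flag_ratio):
    """Hypothetical flow record: [packet_rate, byte_rate, flag_ratio]."""
    return [rate, 2 * rate, flag_ratio]  # byte_rate is redundant by construction


# Synthetic data (assumption): 40 low-rate flows followed by 10 high-rate flows.
rows = [flow(1.0 + 0.01 * (i % 5), (i * 7 % 10) / 10) for i in range(40)]
rows += [flow(5.0 + 0.01 * (i % 5), (i * 3 % 10) / 10) for i in range(10)]

kept = select_features(rows)                       # column indices that survive
points = [tuple(r[j] for j in kept) for r in rows]
labels = two_means(points)

# Assumption: the smaller cluster is the abnormal one.
minority = min((0, 1), key=labels.count)
flagged = [i for i, lab in enumerate(labels) if lab == minority]
print("kept columns:", kept)
print("flagged flows:", flagged)
```

On real traffic the features would normally be scaled before clustering, since K-means distances are dominated by whichever column has the largest range.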

Compared Insights on Machine-Learning Anomaly Detection for Process Control Feature

Anomaly Detection Algorithm Based on Electric Equipment

State-Based Control Feature Extraction for Effective Anomaly Detection in Process Industries

Recent Advances in Machine Learning-based Anomaly Detection for Industrial Control Networks

Enhanced Anomaly Detection in Industrial Control Systems aided by Machine Learning

Anomaly Detection for Industrial Control Operations with Optimized ABC–SVM and Weighted Function Code Correlation Analysis

Function-Aware Anomaly Detection Based on Wavelet Neural Network for Industrial Control Communication

Evaluation and Comparison of Machine-Learning Algorithm for Network Intrusion Detection

A Comparative Study of Time Series Anomaly Detection Models for Industrial Control Systems

Novel Machine Learning Techniques for Anomaly Intrusion Detection

Design And Analysis Of Multimodel-Based Anomaly Intrusion Detection Systems In Industrial Process Automation

Performance Analysis of Anomaly-Based Network Intrusion Detection Using Feature Selection and Machine Learning Techniques

Anomaly-Based Network Intrusion Detection Using SVM

Machine Learning for Cyber Security: Third International Conference, ML4CS 2020, Guangzhou, China, October 8–10, 2020, Proceedings, Part II

Machine learning for intrusion detection in industrial control systems: challenges and lessons from experimental evaluation

Functional Pattern-Related Anomaly Detection Approach Collaborating Binary Segmentation with Finite State Machine

Comparative Analysis of Intrusion Detection System Using Machine Learning and Deep Learning Algorithms

Anomaly Detection for Industrial Control Networks using Machine Learning with the help from the Inter-Arrival Curves

Traffic Anomaly Detection Model Using K-Means and Active Learning Method

Machine Learning-Based Threat Identification of Industrial Internet

Evaluation of Machine Learning-based Anomaly Detection Algorithms on an Industrial Modbus/TCP Data Set