An Effective Feature Selection Algorithm for Machine Learning-based Malicious Traffic Detection
Chao Fei, Nian Xia, Pang-Wei Tsai, Yang Lu, Xiaonan Pan, Junli Gong
DOI: https://doi.org/10.1109/asiajcis64263.2024.00024
2024-01-01
Abstract: Malicious traffic detection is important for defending against network attacks. Traditional machine learning-based malicious traffic detection methods aim to improve detection performance and ignore resource consumption. For Internet of Things devices with constrained resources, reducing the resource consumption of traffic detection while preserving detection accuracy is challenging. To address this problem, this paper proposes an effective feature selection algorithm based on the chi-squared test (EFS-CST), which selects the most relevant features for training machine learning (ML) models such as random forest, naive Bayes, decision tree, and convolutional neural networks. The proposed algorithm was evaluated in terms of accuracy, precision, F1 score, AUROC (area under the receiver operating characteristic curve), AUC-PR (area under the precision-recall curve), model size, training time, and dataset size. The results show that ML models with EFS-CST achieve detection performance similar to that of ML models trained on all features, on both the UNSW-NB15 and TII-SSRC-23 datasets. For some ML models, detection performance with EFS-CST even exceeds that obtained without feature selection. The dataset sizes for TII-SSRC-23 and UNSW-NB15 were reduced by 48.8% and 47.8%, respectively. In addition, the model training time for the TII-SSRC-23 and UNSW-NB15 datasets was reduced by up to 86.1% and 48.9%, respectively. Finally, the model sizes for the TII-SSRC-23 and UNSW-NB15 datasets were reduced by up to 60.0% and 65.6%, respectively.
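The abstract does not spell out the steps of EFS-CST, so the following is only a minimal sketch of generic chi-squared feature ranking, not the authors' algorithm. It assumes non-negative feature values (for example counts or min-max scaled flow statistics) and scores each feature by comparing its per-class sum with the sum expected if the feature were independent of the label. The function names `chi2_scores` and `select_top_k`, the parameter `k`, and the toy data are all hypothetical.

```python
from collections import defaultdict


def chi2_scores(X, y):
    """Chi-squared statistic of each non-negative feature against the labels.

    X is a list of equal-length rows, y the matching list of class labels.
    """
    n_samples = len(X)
    n_features = len(X[0])
    class_counts = defaultdict(int)
    # observed[c][j]: sum of feature j over all samples of class c
    observed = defaultdict(lambda: [0.0] * n_features)
    for row, label in zip(X, y):
        class_counts[label] += 1
        sums = observed[label]
        for j, value in enumerate(row):
            sums[j] += value

    scores = []
    for j in range(n_features):
        total = sum(observed[c][j] for c in class_counts)
        score = 0.0
        for c, count in class_counts.items():
            # Expected sum if feature j were independent of the class.
            expected = total * count / n_samples
            if expected > 0:
                score += (observed[c][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_k(X, y, k):
    """Keep the k highest-scoring feature columns (k is an assumed parameter)."""
    scores = chi2_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[row[j] for j in keep] for row in X]
    return keep, reduced


# Toy data (hypothetical): three features, label 0 = benign, 1 = malicious.
X_toy = [[1, 0, 3], [1, 0, 3], [0, 1, 3], [0, 1, 3]]
y_toy = [0, 0, 1, 1]
toy_scores = chi2_scores(X_toy, y_toy)
kept_columns, X_reduced = select_top_k(X_toy, y_toy, k=2)
```

In this toy case the third feature is identical across both classes, so it receives a score of zero and is the one dropped. Training on the reduced matrix rather than the full one is what would shrink the dataset, the training time, and the model size in the way the abstract reports.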