Abstract: As cyber-attacks become more sophisticated, improving the robustness of Machine Learning (ML) models must be a priority for enterprises of all sizes. To reliably compare the robustness of different ML models for cyber-attack detection in enterprise computer networks, they must be evaluated in standardized conditions. This work presents a methodical adversarial robustness benchmark of multiple decision tree ensembles with constrained adversarial examples generated from standard datasets. The robustness of regularly and adversarially trained RF, XGB, LGBM, and EBM models was evaluated on the original CICIDS2017 dataset, a corrected version of it designated as NewCICIDS, and the HIKARI dataset, which contains more recent network traffic. NewCICIDS led to models with a better performance, especially XGB and EBM, but RF and LGBM were less robust against the more recent cyber-attacks of HIKARI. Overall, the robustness of the models to adversarial cyber-attack examples was improved without their generalization to regular traffic being affected, enabling a reliable detection of suspicious activity without costly increases of false alarms.

What problem does this paper attempt to address?

The paper addresses the question of how to improve the robustness of machine learning (ML) models for network intrusion detection as cyber-attacks grow more complex. Specifically, it provides a systematic method for evaluating the adversarial robustness of decision tree ensembles used for cyber-attack detection in enterprise computer networks. Constrained adversarial examples are generated from standard datasets and used to evaluate regularly and adversarially trained Random Forest (RF), XGBoost (XGB), LightGBM (LGBM), and Explainable Boosting Machine (EBM) models. The aim is to improve the robustness of these models to adversarial cyber-attack examples without affecting their generalization to regular traffic.

### Main problem summary:

1. **Improving the robustness of ML models**: As cyber-attack techniques advance, traditional ML models may fail to cope with new attack methods. The paper aims to improve the robustness of ML models through methods such as adversarial training.
2. **Standardizing evaluation conditions**: To reliably compare the robustness of different ML models, they must be evaluated under standardized conditions. Different studies adopt different evaluation methods, which makes it difficult to determine which models are best suited to a specific enterprise network environment.
3. **The influence of adversarial examples**: The paper explores how adversarial examples (i.e., malicious inputs carefully designed to evade detection) affect the performance of ML models, and uses adversarial training to make the models more robust to them.
4. **Dataset selection and processing**: The paper uses three standard datasets to evaluate the robustness of the models: the original CICIDS2017, a corrected version of it designated as NewCICIDS, and HIKARI, which contains more recent network traffic. These datasets represent network traffic from different time periods and of different types, which gives a more comprehensive view of model performance.

### Formula representation:

- The adversarial example generation method (such as A2PM) applies a small perturbation to the feature values, which can be represented as:
  \[ x_{\text{adv}} = x + \delta \]
  where \(x\) is the original sample, \(\delta\) is the added perturbation, and \(x_{\text{adv}}\) is the generated adversarial example.
- In adversarial training, the loss function usually combines the loss on the original sample with the loss on the adversarial example, which can be represented as:
  \[ L_{\text{adv}} = L(x, y) + L(x_{\text{adv}}, y) \]
  where \(L(x, y)\) is the loss on the original sample and \(L(x_{\text{adv}}, y)\) is the loss on the adversarial example.

In this way, the paper aims to give enterprises a reliable benchmarking method for selecting the ML model that best suits their network security needs.
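The two formulas above can be illustrated with a minimal Python sketch. It is not the paper's method. The perturbation is plain bounded random noise rather than A2PM, the per-feature `bounds` and the `eps` budget are hypothetical stand-ins for domain constraints, and a toy logistic model replaces the tree ensembles so that a loss value can be computed directly. For tree ensembles, adversarial training is more commonly realized by adding adversarial examples to the training set than by summing losses explicitly.

```python
import math
import random


def perturb(x, bounds, eps, rng):
    """Return x_adv = x + delta, keeping every feature inside its valid range.

    Each delta_i is drawn uniformly from [-eps, eps] times the feature range,
    then the result is clipped to [lo, hi] (an assumed, simplified constraint).
    """
    x_adv = []
    for value, (lo, hi) in zip(x, bounds):
        delta = rng.uniform(-eps, eps) * (hi - lo)
        x_adv.append(min(hi, max(lo, value + delta)))
    return x_adv


def log_loss(weights, bias, x, y):
    """Binary cross-entropy L(x, y) of a toy logistic model for one sample."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    p = 1.0 / (1.0 + math.exp(-z))
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def adversarial_loss(weights, bias, x, x_adv, y):
    """L_adv = L(x, y) + L(x_adv, y)."""
    return log_loss(weights, bias, x, y) + log_loss(weights, bias, x_adv, y)


# Hypothetical two-feature flow record, labelled 1 (attack).
rng = random.Random(0)
bounds = [(0.0, 1.0), (0.0, 10.0)]
eps = 0.1
x, y = [0.95, 4.0], 1
weights, bias = [1.5, -0.2], 0.1

x_adv = perturb(x, bounds, eps, rng)
total = adversarial_loss(weights, bias, x, x_adv, y)
print("x_adv =", x_adv)
print("L_adv =", total)
```

Clipping after the perturbation keeps the adversarial example inside the valid feature range, which is the simplest form of the constraint idea described in the abstract. Real network traffic features usually also have dependencies between them that this sketch ignores.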

An Adversarial Robustness Benchmark for Enterprise Network Intrusion Detection
