ROSpace: Intrusion Detection Dataset for a ROS2-Based Cyber-Physical System

Tommaso Puccetti, Simone Nardi, Cosimo Cinquilli, Tommaso Zoppi, Andrea Ceccarelli
2024-02-13
Abstract: Most intrusion detection datasets used to research machine learning-based intrusion detection systems (IDSs) are devoted to cyber-only systems, and they typically collect data from one architectural layer. Additionally, the attacks are often generated in dedicated attack sessions, without reproducing the realistic alternation and overlap of normal and attack actions. We present a dataset for intrusion detection, built by performing penetration testing on an embedded cyber-physical system built over Robot Operating System 2 (ROS2). Features are monitored from three architectural layers: the Linux operating system, the network, and the ROS2 services. The dataset is structured as a time series and describes the expected behavior of the system and its response to ROS2-specific attacks: it repeatedly alternates periods of attack-free operation with periods when a specific attack is being performed. Notably, this allows measuring the time to detect an attacker and the number of malicious activities performed before detection. It also allows training an intrusion detector to minimize both, by taking advantage of the numerous alternating periods of normal and attack operations.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the shortage of intrusion detection datasets for embedded Cyber-Physical Systems (CPS). Most existing intrusion detection datasets focus on cyber-only systems and typically collect data from a single architectural layer. They also tend to generate attacks in dedicated attack sessions, so they fail to reproduce the realistic alternation and overlap between normal operations and attack behaviors. To fill this gap, the paper presents a dataset named ROSPaCe, built by performing penetration testing on an embedded Cyber-Physical System based on Robot Operating System 2 (ROS2). The characteristics of the ROSPaCe dataset are as follows:

1. **Multi-layer Monitoring**: Features are collected from three architectural layers: the Linux operating system, the network, and the ROS2 services.
2. **Real Execution Environment**: The dataset is recorded in an actual running environment, with no simulated traffic or user behavior.
3. **Dynamic Alternation Pattern**: The dataset is organized as a time series that repeatedly alternates between attack-free operation and periods in which a specific attack is performed, reproducing scenarios where attackers attempt to penetrate the system.
4. **Specific Attacks**: It includes ROS2-specific attacks, such as Discovery and Denial of Service (DoS) attacks, for a total of 6 different attack types.
5. **Large-scale Data**: The final version of the ROSPaCe dataset contains over 30 million data points with 482 features, for a total size of approximately 40.5 GB.

With these characteristics, the ROSPaCe dataset can serve as a resource for training and evaluating Intrusion Detection Systems (IDSs) for embedded Cyber-Physical Systems, particularly those based on ROS2.
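The abstract notes that alternating attack-free and attack periods makes it possible to measure the time to detect an attacker. A minimal sketch of that measurement follows. It is not the authors' evaluation code, and it assumes a hypothetical layout: one timestamp, one ground-truth label (1 = attack, 0 = normal), and one detector prediction per data point. The count of attack data points seen before the first alarm is used here only as a rough stand-in for "malicious activities performed before detection".

```python
from typing import List, Optional, Tuple


def attack_periods(labels: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for each
    contiguous run of attack-labeled data points (label == 1)."""
    periods = []
    start = None
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i                      # an attack period begins
        elif label != 1 and start is not None:
            periods.append((start, i))     # the attack period ends
            start = None
    if start is not None:                  # series ends during an attack
        periods.append((start, len(labels)))
    return periods


def detection_latencies(
    timestamps: List[float],
    labels: List[int],
    predictions: List[int],
) -> List[Tuple[Optional[float], int]]:
    """For each attack period, return (latency, points_before_alarm).

    latency is the time from the start of the attack to the first alarm
    raised inside that period, or None if the attack was never detected.
    points_before_alarm counts attack data points preceding the alarm
    (the whole period length if the attack was missed).
    """
    results = []
    for start, end in attack_periods(labels):
        first_alarm = next(
            (i for i in range(start, end) if predictions[i] == 1), None
        )
        if first_alarm is None:
            results.append((None, end - start))
        else:
            latency = timestamps[first_alarm] - timestamps[start]
            results.append((latency, first_alarm - start))
    return results


# Toy series (made-up values): two attack periods, only the first detected.
timestamps = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
labels = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
predictions = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0]

latencies = detection_latencies(timestamps, labels, predictions)
print(latencies)
```

Because the dataset contains many such alternations, a per-period summary like this yields a distribution of latencies and not just a single number, which is what makes it usable as a training or tuning objective.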