Abstract: As cyber attacks grow increasingly sophisticated and stealthy, it becomes more imperative and more challenging to distinguish intrusions from normal behaviors. Through fine-grained causality analysis, provenance-based intrusion detection systems (PIDS) have demonstrated a promising capacity to distinguish benign and malicious behaviors, attracting widespread attention from both industry and academia. Among diverse approaches, rule-based PIDS stand out due to their lightweight overhead, real-time capabilities, and explainability. However, existing rule-based systems suffer from low detection accuracy, especially a high number of false alarms, due to the lack of fine-grained rules and environment-specific configurations. In this paper, we propose CAPTAIN, a rule-based PIDS capable of automatically adapting to diverse environments. Specifically, we propose three adaptive parameters to adjust the detection configuration with respect to nodes, edges, and alarm generation thresholds. We build a differentiable tag propagation framework and utilize the gradient descent algorithm to optimize these adaptive parameters based on the training data. We evaluate our system using data from DARPA Engagements and simulated environments. The evaluation results demonstrate that CAPTAIN enhances rule-based PIDS with learning capabilities, resulting in improved detection accuracy, reduced detection latency, lower runtime overhead, and more interpretable detection procedures and results compared to the state-of-the-art (SOTA) PIDS.

What problem does this paper attempt to address?

The paper targets three problems that existing rule-based provenance-based intrusion detection systems (PIDS) face in real deployments: low detection accuracy, a high false-positive rate, and poor adaptability to the environment. Specifically:

1. **Low Detection Accuracy and High False-Positive Rate**: Existing rule-based PIDS rely on overly simple, general rules and cannot adapt to different environments, so their detection is either too strict (generating a large number of false positives) or too lax (missing real attacks). For example, when dealing with "gray" nodes in cloud services (such as the IP addresses of FaaS platforms), these systems cannot reliably distinguish benign from malicious behavior, which leads to false positives or false negatives.
2. **Lack of Environmental Adaptability**: Traditional rule-based PIDS usually rely on static configurations and manual tuning, which makes it hard to adjust rules to a specific environment. As a result, they perform poorly in complex and changing network environments, especially in security operation centers (SOCs), where analysts spend a great deal of time configuring models by hand.

To solve these problems, the paper proposes CAPTAIN, a new rule-based PIDS that adjusts its rule configuration automatically by introducing adaptive parameters and optimizing them with gradient descent, thereby improving detection accuracy and reducing the false-positive rate. CAPTAIN introduces three adaptive parameters:

- **Label Initialization Parameter (A)**: Determines the initial labels (called tags in the abstract) of system entities.
- **Label Propagation Rate Parameter (G)**: Adjusts how strongly system events affect labels.
- **Alarm Generation Threshold Parameter (T)**: Adjusts the alarm generation rules.
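The roles of the three parameters can be made concrete with a small sketch. This is an illustration only, not CAPTAIN's implementation: the summary does not specify the label semantics or the propagation rule, so the scalar label in [0, 1] (1.0 meaning trusted), the update formula, the class `ToyDetector`, and the event names are all assumptions made here for the example.

```python
from dataclasses import dataclass, field

@dataclass
class ToyDetector:
    """Hypothetical rule-based detector with the three adaptive parameters."""
    init_tag: dict            # A: entity -> initial label in [0, 1] (assumed scale)
    rate: dict                # G: event type -> propagation rate in [0, 1]
    threshold: dict           # T: event type -> alarm threshold
    default_tag: float = 1.0  # entities not listed in A start as trusted
    tags: dict = field(default_factory=dict)

    def tag_of(self, entity):
        # Lazily initialize an entity's label from A.
        if entity not in self.tags:
            self.tags[entity] = self.init_tag.get(entity, self.default_tag)
        return self.tags[entity]

    def process(self, src, event, dst):
        """Propagate the label from src to dst; return True if an alarm fires."""
        s, d = self.tag_of(src), self.tag_of(dst)
        g = self.rate.get(event, 1.0)
        # Assumed rule: move dst toward the lower of the two labels, scaled by g.
        new_d = d + g * (min(s, d) - d)
        self.tags[dst] = new_d
        # Event types without a threshold in T never raise an alarm.
        return new_d < self.threshold.get(event, 0.0)

# A "gray" cloud IP downloads a file that is later executed.
EVENTS = [
    ("203.0.113.7", "recv", "proc:curl"),
    ("proc:curl", "write", "file:/tmp/x"),
    ("file:/tmp/x", "exec", "proc:x"),
]
G = {"recv": 1.0, "write": 0.5, "exec": 1.0}
T = {"exec": 0.6}

strict = ToyDetector(init_tag={"203.0.113.7": 0.0}, rate=G, threshold=T)
tuned = ToyDetector(init_tag={"203.0.113.7": 0.8}, rate=G, threshold=T)
strict_alarms = [strict.process(*e) for e in EVENTS]
tuned_alarms = [tuned.process(*e) for e in EVENTS]
print(strict_alarms, tuned_alarms)
```

With the gray IP initialized as fully untrusted, the final `exec` event raises an alarm; raising only that one entry of A to 0.8 suppresses it without touching the rule itself. In this toy setting, that is the kind of environment-specific adjustment the three parameters are meant to capture.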
Through these adaptive parameters, CAPTAIN learns and optimizes its rule configuration automatically during training, which yields more accurate detection and response. CAPTAIN also retains the advantages of rule-based PIDS: it is lightweight, has low latency, and is interpretable.

### Main Contributions of the Paper

1. **Proposing CAPTAIN**: A rule-based PIDS that adjusts its rules automatically, combining the advantages of traditional rule-based systems (lightweight, low latency, interpretability) with the adaptive capabilities of machine-learning systems.
2. **Designing a Differentiable Label Propagation Framework**: Transforming the rule-based PIDS into a differentiable function and using gradient descent to optimize the adaptive parameters, thereby reducing false positives.
3. **System Evaluation**: Evaluating CAPTAIN in multiple scenarios, including the DARPA datasets and simulated-environment datasets. The experimental results show that CAPTAIN reduces false positives by more than 90% compared to traditional rule-based PIDS and performs well in terms of detection accuracy, runtime overhead, and latency.

Through these improvements, CAPTAIN aims to overcome the limitations of existing rule-based PIDS and provide a more flexible and accurate intrusion-detection solution.
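The second contribution, making the detector differentiable and tuning it with gradient descent, can be illustrated with a minimal sketch. It is not the paper's loss or propagation rule, neither of which appears in this summary. It assumes a single benign flow from one source entity (initial label `a`) into a trusted process, a linear propagation step with rate `g`, a sigmoid as a smooth stand-in for the hard threshold `t`, and a quadratic penalty that keeps `a` near its conservative default `a0`. All function names and constants are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def soft_alarm(a, g, t, k=10.0):
    """Smooth alarm score in (0, 1) for one flow: source (label a) -> process (label 1.0)."""
    p = 1.0 + g * (a - 1.0)      # propagated label of the process (assumed linear rule)
    return sigmoid(k * (t - p))  # approaches 1 as p drops below the threshold t

def loss_and_grad(a, g, t, a0, lam=0.05, k=10.0):
    """Loss on a benign flow (any alarm is a false positive) and its derivative w.r.t. a."""
    s = soft_alarm(a, g, t, k)
    loss = s + lam * (a - a0) ** 2
    # Chain rule: ds/dp = -k * s * (1 - s) and dp/da = g.
    grad = -k * s * (1.0 - s) * g + 2.0 * lam * (a - a0)
    return loss, grad

def fit_initial_label(a0=0.0, g=1.0, t=0.5, lr=0.5, steps=200):
    """Plain gradient descent on the initial label, clipped to [0, 1]."""
    a = a0
    for _ in range(steps):
        _, grad = loss_and_grad(a, g, t, a0)
        a = min(1.0, max(0.0, a - lr * grad))
    return a

a_fit = fit_initial_label()
print("alarm score before:", soft_alarm(0.0, 1.0, 0.5))
print("alarm score after: ", soft_alarm(a_fit, 1.0, 0.5))
```

Because the alarm score is a smooth function of `a`, its gradient indicates how to change the initial label so that the benign flow stops triggering an alarm, while the penalty term discourages moving further from the default than the data requires. The same idea would extend to propagation rates and thresholds by differentiating with respect to `g` and `t`.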

Incorporating Gradients to Rules: Towards Lightweight, Adaptive Provenance-based Intrusion Detection
