Abstract: In recent years, penetration testing (pen-testing) has emerged as a crucial process for evaluating the security of network infrastructures by simulating real-world cyber-attacks. Automating pen-testing through reinforcement learning (RL) enables more frequent assessments, reduces human effort, and improves scalability. However, real-world pen-testing tasks often involve incomplete knowledge of the target network. Modeling this intrinsic uncertainty with partially observable Markov decision processes (POMDPs) remains a persistent challenge in pen-testing. Furthermore, RL agents must learn intricate strategies to cope with partially observable environments, which increases computational and time costs. To address these issues, this study introduces EPPTA (efficient POMDP-driven penetration testing agent), an agent built on an asynchronous RL framework and designed for pen-testing tasks in partially observable environments. We incorporate an implicit belief module in EPPTA, grounded in the belief update formula of the traditional POMDP model, which represents the agent's probabilistic estimate of the current environment state. Furthermore, by integrating the algorithm with the high-performance RL framework Sample Factory, EPPTA significantly reduces convergence time compared to existing pen-testing methods, yielding an approximately 20-fold acceleration. Empirical results across various pen-testing scenarios validate EPPTA's superior task reward and enhanced scalability, providing substantial support for efficient and advanced evaluation of network infrastructure security.

The article introduces EPPTA, a reinforcement learning framework designed for penetration testing, and assesses its performance across diverse network configurations. EPPTA incorporates a belief module that strengthens its ability to handle partially observable security challenges. EPPTA converges faster than other methods, especially in larger and more complex scenarios, showcasing its scalability and adaptability to evolving security challenges.
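
The abstract states that the belief module is grounded in the belief update formula of the traditional POMDP model. For readers unfamiliar with it, the classical discrete update is b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s). The sketch below implements only this textbook formula in plain Python. It is not EPPTA's implementation: the abstract describes that module as implicit. The function name, the two-state "host" model, and all probabilities are hypothetical values chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def belief_update(belief: List[float], action: int, observation: int,
                  T: List[Matrix], O: List[Matrix]) -> List[float]:
    """Classical discrete POMDP belief update (Bayes filter).

    T[a][s][s2] is the probability of moving from state s to s2 under action a.
    O[a][s2][o] is the probability of observing o after reaching s2 via action a.
    """
    n = len(belief)
    # Prediction step: push the current belief through the transition model.
    predicted = [sum(T[action][s][s2] * belief[s] for s in range(n))
                 for s2 in range(n)]
    # Correction step: weight each predicted state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    # Normalize so the new belief sums to one.
    return [p / total for p in unnormalized]


# Hypothetical toy model: state 0 = host not vulnerable, state 1 = host vulnerable.
# Action 0 is a scan that does not change the state (identity transition).
T = [[[1.0, 0.0],
      [0.0, 1.0]]]
# The scan reports "looks vulnerable" (observation 1) with probability 0.9 when
# the host is vulnerable and 0.2 when it is not (assumed numbers).
O = [[[0.8, 0.2],
      [0.1, 0.9]]]

prior = [0.5, 0.5]
posterior = belief_update(prior, action=0, observation=1, T=T, O=O)
print(posterior)
```

With these assumed numbers, a positive scan result shifts the belief that the host is vulnerable from 0.5 to 9/11 (about 0.82), which is the kind of probabilistic state estimate the abstract refers to.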

A Hierarchical Deep Reinforcement Learning Model with Expert Prior Knowledge for Intelligent Penetration Testing

Hierarchical reinforcement learning for efficient and effective automated penetration testing of large networks

INNES: An intelligent network penetration testing model based on deep reinforcement learning

An Automated Penetration Testing Framework Based on Hierarchical Reinforcement Learning

Knowledge-Informed Auto-Penetration Testing Based on Reinforcement Learning with Reward Machine

EPPTA: Efficient Partially Observable Reinforcement Learning Agent for Penetration Testing Applications

DynPen: Automated Penetration Testing in Dynamic Network Scenarios Using Deep Reinforcement Learning

Autonomous Penetration Testing Based on Improved Deep Q-Network

Behaviour-diverse automatic penetration testing: a coverage-based deep reinforcement learning approach

GAIL-PT: An Intelligent Penetration Testing Framework with Generative Adversarial Imitation Learning

SetTron: Towards Better Generalisation in Penetration Testing with Reinforcement Learning

Safe Exploration in Wireless Security: A Safe Reinforcement Learning Algorithm With Hierarchical Structure

A Layered Reference Model for Penetration Testing with Reinforcement Learning and Attack Graphs

Evaluation of Reinforcement Learning for Autonomous Penetration Testing using A3C, Q-learning and DQN

Raijū: Reinforcement Learning-Guided Post-Exploitation for Automating Security Assessment of Network Systems

Discover the Hidden Attack Path in Multi-domain Cyberspace Based on Reinforcement Learning

Application Study on the Reinforcement Learning Strategies in the Network Awareness Risk Perception and Prevention

Reinforcement learning-based autonomous attacker to uncover computer network vulnerabilities

An Intelligent Penetration Testing Method Using Human Feedback

A Survey for Deep Reinforcement Learning Based Network Intrusion Detection

Applying Action Masking and Curriculum Learning Techniques to Improve Data Efficiency and Overall Performance in Operational Technology Cyber Security using Reinforcement Learning