Network defense decision-making based on deep reinforcement learning and dynamic game theory

Huang Wanwei, Yuan Bo, Wang Sunan, Ding Yi, Li Yuhua
DOI: https://doi.org/10.23919/jcc.ja.2022-0401
2024-10-01
China Communications
Abstract: Existing research on cyber attack-defense analysis has typically adopted stochastic game theory to model the problem, but these models assume complete rationality and ignore the information opacity of practical attack and defense scenarios, so both the models and the resulting methods lack accuracy. To address this problem, we investigate network defense policy methods under bounded (finite) rationality constraints and propose a network defense policy selection algorithm based on deep reinforcement learning. Using graph-theoretical methods, we transform the decision-making problem into a path optimization problem, and we use a compression method based on service nodes to map the network state. On this basis, we improve the A3C algorithm and design Defense-A3C, a defense policy selection algorithm with online learning capability. The experimental results show that the proposed model and method converge stably to a better network state after training, and do so faster and more stably than the original A3C algorithm. Comparisons with existing typical approaches verify the advantages of Defense-A3C.
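For readers unfamiliar with A3C, the following is a minimal sketch of the n-step return and advantage computation used by the standard A3C algorithm that the abstract takes as its baseline. It is not the paper's Defense-A3C, whose modifications the abstract does not specify; the function name `n_step_advantages`, its parameters, and the example numbers are hypothetical and chosen only for illustration.

```python
def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Compute discounted n-step returns and advantages for one rollout segment.

    rewards[t]      : reward received at step t
    values[t]       : critic estimate V(s_t) at step t
    bootstrap_value : critic estimate for the state after the last step
                      (use 0.0 if the episode terminated)
    gamma           : discount factor (illustrative default, not from the paper)
    """
    returns = []
    running = bootstrap_value
    # Walk the segment backwards so each return reuses the one after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    # Advantage = how much better the observed return was than the critic expected.
    advantages = [ret - v for ret, v in zip(returns, values)]
    return returns, advantages


# Hypothetical three-step rollout with made-up rewards and value estimates.
example_returns, example_advantages = n_step_advantages(
    rewards=[1.0, 0.0, 2.0],
    values=[0.5, 0.5, 0.5],
    bootstrap_value=0.0,
    gamma=0.5,
)
```

In standard A3C, each worker uses these advantages to weight the policy-gradient term and uses the returns as regression targets for the critic before applying its gradients asynchronously to the shared parameters.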
telecommunications