Jamming Policy Generation via Heuristic Programming Reinforcement Learning

Yujie Zhang, Weibo Huo, Yulin Huang, Cui Zhang, Jifang Pei, Yin Zhang, Jianyu Yang
DOI: https://doi.org/10.1109/TAES.2023.3312231
IF: 3.491
2023-01-01
IEEE Transactions on Aerospace and Electronic Systems
Abstract: Radar countermeasure (RCM) is increasingly important in modern warfare. Fast and accurate jamming decisions made by RCM systems can provide timely electronic protection for important or high-value targets. To quickly find a jamming policy against a multifunctional radar, this article proposes a jamming policy generation scheme based on heuristic programming reinforcement learning, in which the jamming strategy is dynamically adjusted through interactive self-learning. First, the relationships between radar operation modes and jamming modes are investigated under self-defense jamming in an air-to-air RCM scenario, which is modeled as a Markov decision process. In particular, potential energy is employed to construct a heuristic reward function that keeps the jammer from getting stuck in a local region of the jamming action space. In addition, a model of radar state transitions is derived from the real experience of jammer-radar interaction, which narrows the state-action space. Simulated experience generated by this model accelerates updates of the value function. The proposed method decreases sample complexity while increasing sample utilization, reducing the time needed for jamming decision-making. Numerical experimental results demonstrate that the proposed method achieves better jamming decision performance than prevailing methods, effectively improving the real-time performance and interference effectiveness of RCM systems.
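The two ingredients named in the abstract, a potential-based heuristic reward and simulated experience drawn from a learned transition model, can be illustrated with a minimal sketch. It is not the authors' algorithm: it substitutes standard potential-based reward shaping and Dyna-Q-style planning on a toy problem. The five radar modes, three jamming modes, the transition rule in `step`, the potential function, and every constant below are hypothetical choices made only for illustration.

```python
import random

N_STATES = 5         # hypothetical radar modes; 0 = least threatening (terminal)
N_ACTIONS = 3        # hypothetical jamming modes
GAMMA = 0.9          # discount factor
ALPHA = 1.0          # full-step update, safe only because the toy env is deterministic
EPSILON = 0.1        # exploration rate
PLANNING_STEPS = 10  # simulated updates per real interaction


def step(state, action):
    """Toy radar response (an assumption, not the paper's radar model):
    only the jamming mode state % N_ACTIONS pushes the radar down one mode."""
    next_state = state - 1 if action == state % N_ACTIONS else state
    reward = 1.0 if next_state == 0 else 0.0
    return next_state, reward


def potential(state):
    """Hypothetical potential: higher when the radar is in a less threatening mode."""
    return -float(state)


def shaped_reward(reward, state, next_state):
    """Potential-based shaping: r + gamma * phi(s') - phi(s)."""
    return reward + GAMMA * potential(next_state) - potential(state)


def update(q, state, action, reward, next_state):
    """One Q-learning backup on the shaped reward."""
    future = 0.0 if next_state == 0 else max(q[next_state])
    target = shaped_reward(reward, state, next_state) + GAMMA * future
    q[state][action] += ALPHA * (target - q[state][action])


def train(episodes=50, max_steps=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    model = {}  # learned transition model: (state, action) -> (next_state, reward)
    for _ in range(episodes):
        state = N_STATES - 1
        for _ in range(max_steps):
            if rng.random() < EPSILON:
                action = rng.randrange(N_ACTIONS)
            else:
                action = max(range(N_ACTIONS), key=lambda a: q[state][a])
            next_state, reward = step(state, action)
            model[(state, action)] = (next_state, reward)  # real experience
            update(q, state, action, reward, next_state)
            # Planning: replay simulated experience drawn from the learned model.
            for _ in range(PLANNING_STEPS):
                (s, a), (s2, r) = rng.choice(list(model.items()))
                update(q, s, a, r, s2)
            state = next_state
            if state == 0:
                break
    return q


def greedy_policy(q):
    """Best jamming mode per radar mode (the entry for terminal state 0 is unused)."""
    return [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(N_STATES)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the shaping term rewards every transition toward a less threatening radar mode immediately, instead of only at the terminal mode, and the planning loop reuses each real jammer-radar interaction several times. These are the same two levers the abstract credits for avoiding local traps and for lowering sample complexity.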