Abstract: To improve the efficiency and adaptability of cognitive radar jamming decision-making, this paper proposes a fusion algorithm (Ant-QL) based on the ant colony algorithm and Q-Learning. The algorithm does not rely on a priori information and gains adaptability through real-time interaction between the jammer and the target radar. It applies to both single-jammer and multiple-jammer countermeasure scenarios and achieves a high jamming effect in each. First, the traditional Q-Learning and DQN algorithms are discussed, and a radar jamming decision-making model is built for simulation verification of each algorithm. Then, an improved Q-Learning algorithm is proposed to address the shortcomings of both. By introducing the pheromone mechanism of ant colony algorithms into Q-Learning and using the ε-greedy strategy to balance exploration against exploitation, the algorithm largely avoids falling into a local optimum, which accelerates convergence while keeping the convergence process stable and robust. To better suit the cluster countermeasure environment of future battlefields, the algorithm and model are extended to cluster cooperative jamming decision-making. Each jammer in the cluster is mapped to an intelligent ant searching for the optimal path, and the jammers exchange information with one another. During the confrontation, the method greatly improves convergence speed and stability and reduces the jammer's demand for hardware and power resources. With three jammers, the simulation results show that the convergence speed of the Ant-QL algorithm improves by 85.4%, 80.56% and 72% relative to the Q-Learning, DQN and improved Q-Learning algorithms, respectively. During convergence, the Ant-QL algorithm is stable and efficient, and its complexity is low. 
After the algorithms converge, the average response times of the four algorithms (Q-Learning, DQN, improved Q-Learning and Ant-QL) are 6.99 × 10⁻⁴ s, 2.234 × 10⁻³ s, 2.21 × 10⁻⁴ s and 1.7 × 10⁻⁴ s, respectively. The results show that the improved Q-Learning and Ant-QL algorithms also hold an advantage in average response time after convergence.
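
The abstract describes the mechanism only in words, so the following is a minimal sketch of the general idea, not the paper's implementation. It assumes a hypothetical four-state radar threat model (`TRANSITIONS`), hypothetical reward values, and one particular way of combining the two ingredients: tabular Q-Learning drives exploitation, while the exploratory branch of ε-greedy samples actions in proportion to a shared pheromone table. Each jammer plays the role of one ant, and all jammers update the same Q and pheromone tables. Every name and parameter value here is an illustrative assumption.

```python
import random

# Hypothetical toy model: radar threat states 3 (highest) down to 0 (goal).
# TRANSITIONS[s][a] is the radar state after jamming action a in state s.
TRANSITIONS = {
    3: [3, 2, 3],
    2: [2, 3, 1],
    1: [0, 2, 1],
}
GOAL, START, N_ACTIONS = 0, 3, 3


def train(n_jammers=3, iterations=200, max_steps=50, alpha=0.2, gamma=0.9,
          epsilon=0.1, rho=0.1, deposit=1.0, tau_min=0.05, seed=0):
    """Tabular Q-Learning with pheromone-guided exploration (assumed design)."""
    rng = random.Random(seed)
    q = {s: [0.0] * N_ACTIONS for s in TRANSITIONS}    # shared Q-table
    tau = {s: [1.0] * N_ACTIONS for s in TRANSITIONS}  # shared pheromone table

    for _ in range(iterations):
        successful_paths = []
        for _ in range(n_jammers):  # each jammer acts as one ant
            s, path = START, []
            for _ in range(max_steps):
                if rng.random() < epsilon:
                    # Explore: sample an action in proportion to pheromone.
                    a = rng.choices(range(N_ACTIONS), weights=tau[s])[0]
                else:
                    # Exploit: greedy action with respect to the Q-table.
                    a = max(range(N_ACTIONS), key=lambda i: q[s][i])
                s2 = TRANSITIONS[s][a]
                reward = 10.0 if s2 == GOAL else -1.0  # assumed rewards
                future = 0.0 if s2 == GOAL else max(q[s2])
                # Standard Q-Learning update.
                q[s][a] += alpha * (reward + gamma * future - q[s][a])
                path.append((s, a))
                s = s2
                if s == GOAL:
                    successful_paths.append(path)
                    break
        # Evaporate pheromone, keeping a floor so every action stays explorable.
        for s in tau:
            tau[s] = [max(tau_min, (1.0 - rho) * t) for t in tau[s]]
        # Deposit along successful paths; shorter paths deposit more per pair.
        for path in successful_paths:
            for s, a in set(path):
                tau[s][a] += deposit / len(path)
    return q, tau


def greedy_path(q, max_steps=10):
    """Follow the greedy policy from START and return the visited states."""
    s, visited = START, [START]
    while s != GOAL and len(visited) <= max_steps:
        a = max(range(N_ACTIONS), key=lambda i: q[s][i])
        s = TRANSITIONS[s][a]
        visited.append(s)
    return visited
```

Calling `q, tau = train()` and then `greedy_path(q)` traces the jamming sequence that the learned Q-table prefers from the highest-threat state. Confining the pheromone to the exploratory branch is a design choice of this sketch: it leaves the Q-Learning update itself unchanged, so the pheromone only steers which alternatives get tried.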

Jamming Strategy Optimization through Dual Q-Learning Model against Adaptive Radar

Joint Optimization of Jamming Type Selection and Power Control for Countering Multi-function Radar Based on Deep Reinforcement Learning

Multifunctional Radar Cognitive Jamming Decision Based on Dueling Double Deep Q-Network

GA-Dueling DQN Jamming Decision-Making Method for Intra-Pulse Frequency Agile Radar

A Cognitive Electronic Jamming Decision-Making Method Based on Q-Learning and Ant Colony Fusion Algorithm

Radar Anti-Jamming Decision-Making Method Based on DDPG-MADDPG Algorithm

Cooperative Jamming Resource Allocation with Joint Multi-Domain Information Using Evolutionary Reinforcement Learning

A Radar Anti-Jamming Strategy Based on Game Theory With Temporal Constraints

Performance Analysis of Deep Reinforcement Learning-Based Intelligent Cooperative Jamming Method Confronting Multi-functional Networked Radar

An Intelligent Strategy Decision Method for Collaborative Jamming Based On Hierarchical Multi-Agent Reinforcement Learning

Reinforcement Learning based Anti-jamming Frequency Hopping Strategies Design for Cognitive Radar

An Optimization Method for Collaborative Radar Antijamming Based on Multi-Agent Reinforcement Learning

Improving anti-jamming decision-making strategies for cognitive radar via multi-agent deep reinforcement learning

Design of anti‐jamming decision‐making for cognitive radar

Adaptation of Frequency Hopping Interval for Radar Anti-Jamming Based on Reinforcement Learning

Frequency Agile Anti-Interference Technology Based on Reinforcement Learning Using Long Short-Term Memory and Multi-Layer Historical Information Observation

Reinforcement Learning-Based Anti-Jamming in Networked UAV Radar Systems

Radar and Jammer Intelligent Game under Jamming Power Dynamic Allocation

Radar-Jamming Decision-Making Based on Improved Q-Learning and FPGA Hardware Implementation

A Dynamic Game Strategy for Radar Screening Pulsewidth Allocation Against Jamming Using Reinforcement Learning

Avoiding Jammers: A Reinforcement Learning Approach