Abstract: This paper addresses cognitive radar waveform optimization based on deep reinforcement learning (DRL), in which the radar uses prior knowledge of the environment to adaptively adjust its system parameters. We first design a DRL-based waveform optimization framework in which the agent is the radar, the state is the entropy of the environment state, and the action is the transmit waveform. Because a cognitive radar tracking system changes continuously, the environment and the targets have infinitely many states, so we design a new deep Q-network to map each state-action pair to its Q-value. After training, the radar in the DRL system obtains a policy with which it selects the optimal waveform parameter to transmit according to the entropy of the environment state, defined as the information entropy of the posterior probability of the state. Simulation results show that the proposed approach clearly improves target tracking precision.

A Cognitive Radar Waveform Optimization Approach Based on Deep Reinforcement Learning
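
The state and action described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the posterior over the target state is Gaussian (as a Kalman-type tracker would produce), so that its entropy has a closed form, and it assumes a discrete set of candidate waveform parameters scored by Q-values. The function names `gaussian_entropy` and `select_waveform` and the epsilon-greedy rule are hypothetical choices made for illustration; the Q-values here are supplied by hand where a trained deep Q-network would normally produce them.

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of a Gaussian posterior with covariance `cov`.

    Assumption: the tracker's posterior is Gaussian, so
    H = 0.5 * (n * ln(2*pi*e) + ln det(cov)).
    """
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    n = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    return 0.5 * (n * np.log(2.0 * np.pi * np.e) + logdet)


def select_waveform(q_values, epsilon, rng):
    """Epsilon-greedy choice of a waveform-parameter index.

    `q_values` holds one Q-value per candidate waveform parameter for the
    current state; in a DRL system they would come from the Q-network.
    """
    q_values = np.asarray(q_values, dtype=float)
    if rng.random() < epsilon:
        # Explore: pick a random candidate waveform.
        return int(rng.integers(len(q_values)))
    # Exploit: pick the candidate with the highest Q-value.
    return int(np.argmax(q_values))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    posterior_cov = np.diag([4.0, 0.25])      # hypothetical position/velocity covariance
    state = gaussian_entropy(posterior_cov)   # scalar state fed to the agent
    q_values = [0.1, 0.9, 0.3]                # placeholder Q-values, one per waveform
    action = select_waveform(q_values, epsilon=0.1, rng=rng)
    print(f"entropy state = {state:.4f}, chosen waveform index = {action}")
```

Using the entropy as the state compresses the tracker's uncertainty into a single scalar, which keeps the Q-network input small even though the underlying environment state is continuous.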