A non-myopic approach based on reinforcement learning for multiple moving targets search

Yifan Xu, Yuejin Tan, Zhenyu Lian, Renjie He
DOI: https://doi.org/10.1109/ICINFA.2010.5512235
2010-01-01
Abstract: Myopic information-based approaches, which maximize the information gain of a single observation opportunity, are effective for searching for multiple moving targets in ocean surveillance by space-based sensors. A non-myopic approach based on reinforcement learning is developed to maximize information gain over the long term. Reinforcement learning adjusts the control policy toward optimality and learns system behavior through trial-and-error interaction with a dynamic environment. System states are characterized by the expected information gain, action-value functions are estimated by the online SARSA(λ) algorithm, and the parameterized control policy is approximated by neural networks. Simulations show that, after sufficient training, the non-myopic approach outperforms the myopic approach.
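To make the action-value estimation step concrete, the following is a minimal sketch of online SARSA(λ) with eligibility traces. It is not the authors' implementation: it assumes a tabular action-value function rather than the neural-network approximation named in the abstract, uses replacing traces, and substitutes a hypothetical five-state chain for the information-gain state description. The function name, environment, and all parameter values are illustrative assumptions.

```python
import random


def sarsa_lambda(n_states=5, n_actions=2, episodes=200, alpha=0.1,
                 gamma=0.9, lam=0.8, epsilon=0.1, max_steps=10_000, seed=0):
    """Tabular SARSA(lambda) on a hypothetical chain (illustrative only)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]

    def step(state, action):
        # Assumed toy dynamics: action 1 moves right, action 0 moves left.
        # Reaching the last state ends the episode with reward 1.
        nxt = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        done = nxt == n_states - 1
        return nxt, (1.0 if done else 0.0), done

    def choose(state):
        # Epsilon-greedy selection with random tie-breaking.
        if rng.random() < epsilon:
            return rng.randrange(n_actions)
        best = max(q[state])
        return rng.choice([a for a in range(n_actions) if q[state][a] == best])

    for _ in range(episodes):
        # Eligibility traces are reset at the start of every episode.
        e = [[0.0] * n_actions for _ in range(n_states)]
        s, done, steps = 0, False, 0
        a = choose(s)
        while not done and steps < max_steps:
            s2, r, done = step(s, a)
            a2 = choose(s2)
            # On-policy TD error: bootstrap from the action actually chosen next.
            target = r if done else r + gamma * q[s2][a2]
            delta = target - q[s][a]
            e[s][a] = 1.0  # replacing trace for the visited state-action pair
            for i in range(n_states):
                for j in range(n_actions):
                    q[i][j] += alpha * delta * e[i][j]
                    e[i][j] *= gamma * lam  # decay every trace
            s, a = s2, a2
            steps += 1
    return q
```

Calling `sarsa_lambda()` returns the learned action-value table. The trace decay `gamma * lam` is what lets a single reward update earlier state-action pairs, which is the mechanism that allows credit to reach decisions made several steps before the payoff.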