UAV Target Following in Complex Occluded Environments with Adaptive Multi-Modal Fusion

Xu Lele, Wang Teng, Cai Wenzhe, Sun Changyin
DOI: https://doi.org/10.1007/s10489-022-04317-2
IF: 5.3
2022-01-01
Applied Intelligence
Abstract: Deep reinforcement learning (DRL) has made remarkable progress in unmanned aerial vehicle (UAV) target following. However, current DRL-based solutions exploit only appearance features to recognize and follow the target, and thus tend to lose the target in complex occluded environments. To address this, a novel target-following solution based on adaptive appearance-motion feature fusion is proposed. First, following prior work, we use a convolutional neural network to extract appearance features of the target from the observation image captured by a down-looking camera. Meanwhile, as a novel step, we leverage the action sequences of the UAV to explicitly encode the motion features of the target. An attention module is then introduced to adaptively select the relevant features, which serve as environment states and are fed into the decision module to produce the motion action of the UAV. The whole network is trained with Deep Q-Network to learn the motion policy from observations in an end-to-end manner. We perform simulation experiments on the Virtual Robot Experimentation Platform. Extensive experimental results demonstrate that (1) our proposed method achieves higher tracking accuracy and longer tracking time in various environments than state-of-the-art approaches, and (2) the learned DRL policy generalizes well to unseen environments.
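The pipeline outlined in the abstract (appearance features, motion features, attention-based selection, Q-value decision) can be illustrated with a minimal sketch. The abstract does not specify the attention form, feature dimensions, or network layers, so everything here is an assumption: the function names `fuse`, `q_values`, `greedy_action`, and `td_target` are hypothetical, the CNN and action-sequence encoders are replaced by fixed toy vectors, and a simple dot-product softmax stands in for the paper's attention module. Only the temporal-difference target follows the standard Deep Q-Network formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse(appearance, motion, query):
    """Assumed fusion: softmax attention over the two modalities.

    Each modality is scored against a query vector (learned in a real
    model, fixed here), and the fused state is the weighted sum.
    """
    weights = softmax([dot(query, appearance), dot(query, motion)])
    fused = [weights[0] * a + weights[1] * m for a, m in zip(appearance, motion)]
    return fused, weights


def q_values(state, q_weights):
    """Toy linear Q-head: one weight row per discrete UAV action."""
    return [dot(row, state) for row in q_weights]


def greedy_action(qs):
    """Index of the action with the highest Q-value."""
    return max(range(len(qs)), key=lambda i: qs[i])


def td_target(reward, next_qs, gamma=0.99, done=False):
    """Standard DQN target: r + gamma * max_a' Q(s', a')."""
    return reward if done else reward + gamma * max(next_qs)


# Hypothetical occlusion case: the appearance encoder returns nothing useful
# (all zeros), while the motion encoder still carries a signal.
appearance = [0.0, 0.0, 0.0]
motion = [0.2, 0.9, 0.1]
query = [1.0, 1.0, 1.0]  # assumed attention query, not from the paper

state, weights = fuse(appearance, motion, query)

# Four illustrative discrete actions; the weights are arbitrary.
Q_WEIGHTS = [
    [0.5, 0.1, 0.0],
    [0.0, 0.8, 0.1],
    [0.3, 0.0, 0.6],
    [0.1, 0.1, 0.1],
]
action = greedy_action(q_values(state, Q_WEIGHTS))
```

In this toy occlusion case the attention weights shift toward the motion features, which is the behaviour the abstract attributes to adaptive fusion. In a trained model the query and Q-head weights would be learned end to end rather than fixed by hand.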