A Design of Reward Function in Multi-Target Trajectory Recovery with Deep Reinforcement Learning

Liang He, Yanjie Chu, Chao Shen
DOI: https://doi.org/10.1109/itaic.2019.8785878
2020-01-01
Abstract: In object trajectory detection, detectors often receive several geographical locations without any information about which targets they belong to. The resulting problem, called multi-target trajectory recovery, is to use the location information received by the sensors to reconstruct the trajectory of each target and to distinguish the targets in each frame; it can be solved with Deep Reinforcement Learning (DRL). This paper proposes a mathematical model of the direction and curvature of a target trajectory, based on the characteristics of trajectories, and designs a reward function based on the Trajectory Osculating Circle (TOC) from this model. First, the multi-target trajectory recovery problem is introduced and recast as a model that DRL can implement. Second, a DRL structure for this problem is proposed and tested with the proposed reward function. Finally, a mathematical derivation and physical interpretation of the proposed reward function are given. The experimental results show that, guided by the TOC reward function, DRL recovers trajectories more effectively than the state-of-the-art clustering method, and the recovered traces correspond to the actual trajectories.
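The abstract does not give the TOC reward formula, so the following is only an illustrative sketch of the general idea, not the authors' method. It estimates the curvature of the circle through three consecutive 2-D points (the standard three-point, or Menger, curvature) and scores a candidate next point by how well it preserves the curvature of the existing track. The function names `menger_curvature` and `toc_style_reward`, and the choice of the negative absolute curvature change as the reward, are assumptions made for illustration.

```python
import math


def menger_curvature(p0, p1, p2):
    """Signed curvature of the circle through three 2-D points.

    Positive for a counterclockwise turn, negative for clockwise,
    and 0.0 when the points are collinear or any two coincide.
    """
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    # Twice the signed area of the triangle p0, p1, p2.
    cross = (x1 - x0) * (y2 - y0) - (y1 - y0) * (x2 - x0)
    denom = math.dist(p0, p1) * math.dist(p1, p2) * math.dist(p0, p2)
    if denom == 0.0:
        return 0.0
    # curvature = 4 * area / (product of side lengths) = 2 * cross / denom
    return 2.0 * cross / denom


def toc_style_reward(track, candidate):
    """Hypothetical reward (an assumption, not the paper's formula).

    Compares the curvature of the last three track points with the
    curvature obtained after appending the candidate. The reward is
    0 when the curvature is unchanged and grows more negative as the
    candidate bends the track away from its current circle.
    """
    if len(track) < 3:
        return 0.0  # not enough history to estimate a curvature
    k_prev = menger_curvature(track[-3], track[-2], track[-1])
    k_next = menger_curvature(track[-2], track[-1], candidate)
    return -abs(k_next - k_prev)


# Usage: a track on the unit circle, with two candidate detections.
track = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
on_circle = (0.0, -1.0)    # continues along the same circle
off_circle = (-2.0, -1.0)  # continues in a straight line instead
reward_on = toc_style_reward(track, on_circle)
reward_off = toc_style_reward(track, off_circle)
```

Under this sketch, the detection that stays on the track's current circle receives the higher reward, which is the kind of signal a DRL agent could use to assign detections to targets frame by frame.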