RDDRL: a recurrent deduction deep reinforcement learning model for multimodal vision-robot navigation

Zhenyu Li, Aiguo Zhou
DOI: https://doi.org/10.1007/s10489-023-04754-7
IF: 5.3
2023-07-07
Applied Intelligence
Abstract: Existing deep reinforcement learning-based mobile robot navigation relies largely on single-modal visual perception to perform local-scale navigation, whereas global navigation based on multimodal visual fusion is still under technical exploration. Visual navigation requires agents to drive safely in structured, changing, and even unpredictable environments; otherwise, inappropriate operations may result in mission failure and even irreversible damage to life and property. To address these issues, we propose a recurrent deduction deep reinforcement learning model (RDDRL) for multimodal vision-robot navigation. We incorporate a recurrent reasoning mechanism (RRM) into the reinforcement learning model, which allows the agent to store memory, predict the future, and aid in policy learning. Specifically, the RRM first stores current observations and states by learning a parameterized environment model and then predicts future transitions. The RRM then performs a self-assessment on the predicted behavior and perceives the consequences of the current policy, producing a more reliable decision-making process. Furthermore, to obtain global-scale behavioral decision-making, information from scene recognition, semantic segmentation, and pose estimation is fused and used as partial observations of the RDDRL. Extensive simulated experiments based on CARLA scenarios, as well as test results in real-world scenarios, show that RDDRL outperforms state-of-the-art RL methods in terms of driving stability and safety. After training, the collision rate in the unmanned vehicle's global decision-making decreases from 0.2% in the training state to 0.0% in the test state.
computer science, artificial intelligence
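
The recurrent reasoning loop summarized in the abstract (store observations, learn a parameterized environment model, predict future transitions, then self-assess the predicted behavior) can be illustrated with a toy sketch. This is not the authors' architecture: the one-dimensional "distance to obstacle" state, the linear model with a single parameter `k`, the class name `RecurrentDeductionSketch`, and the horizon and penalty values are all hypothetical choices made only to keep the example small and runnable.

```python
from dataclasses import dataclass, field


@dataclass
class RecurrentDeductionSketch:
    """Toy stand-in for an RRM-style loop (hypothetical, not the paper's model)."""
    horizon: int = 3                  # how many steps to deduce ahead (assumed)
    collision_penalty: float = 100.0  # cost of a predicted collision (assumed)
    memory: list = field(default_factory=list)  # (distance, speed, next_distance)
    k: float = 0.0  # learned parameter of the model: next = distance - k * speed

    def store(self, distance, speed, next_distance):
        # Step 1: keep observed transitions in memory.
        self.memory.append((distance, speed, next_distance))

    def fit(self):
        # Step 2: learn the parameterized environment model by least squares,
        # solving (distance - next_distance) = k * speed for k.
        num = sum(a * (d - n) for d, a, n in self.memory)
        den = sum(a * a for _, a, _ in self.memory)
        self.k = num / den if den else 0.0

    def predict(self, distance, speed):
        # Step 3: predict the next transition with the learned model.
        return distance - self.k * speed

    def assess(self, distance, speed):
        # Step 4: self-assess a candidate behavior by rolling the model forward.
        score = 0.0
        for _ in range(self.horizon):
            distance = self.predict(distance, speed)
            if distance <= 0.0:
                return score - self.collision_penalty
            score += speed  # reward progress while no collision is predicted
        return score

    def act(self, distance, speeds=(0.0, 1.0, 2.0)):
        # Choose the candidate speed whose predicted consequences score best.
        return max(speeds, key=lambda s: self.assess(distance, s))


# Transitions from a toy world in which distance shrinks by exactly `speed`.
rrm = RecurrentDeductionSketch()
for d, a in [(10.0, 1.0), (9.0, 2.0), (7.0, 0.0), (7.0, 2.0)]:
    rrm.store(d, a, d - a)
rrm.fit()

far, near, close = rrm.act(10.0), rrm.act(5.0), rrm.act(1.0)
```

In this toy setting the agent picks the fastest speed when the obstacle is far away and slows down or stops as the predicted rollout starts to end in a collision, which is the intuition behind assessing the consequences of a policy before acting on it.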