Optimal Policy Replay: A Simple Method to Reduce Catastrophic Forgetting in Target Incremental Visual Navigation

Xinting Li, Shizhou Zhang, Yue LU, Kerry Dang, Lingyan Ran, Peng Wang, Yanning Zhang
DOI: https://doi.org/10.1109/cac59555.2023.10450433
2023-01-01
Abstract: Visual navigation is a critical task in robotics and artificial intelligence. In recent years, reinforcement learning-based approaches have become popular for this task. However, existing methods lack flexibility in learning multiple navigation targets and suffer from catastrophic forgetting. To address these challenges, we propose a novel paradigm called “target incremental visual navigation” and introduce a method called Optimal Policy Replay (OPR). Target incremental visual navigation studies how well a navigation agent performs when navigation targets are learned continually rather than all at once. OPR enables this continual learning without relearning all previous targets. Our method divides the learning process into on-policy and off-policy stages and stores only the optimal experiences in memory. Experimental results show that OPR effectively alleviates catastrophic forgetting and achieves good performance with a small memory size.
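
The idea of storing only the optimal experiences can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how optimality is measured or how the memory is replayed. The sketch assumes that "optimal" means the highest episode return per target, and that the off-policy stage draws stored transitions of earlier targets. The names `Episode`, `OptimalReplayMemory`, `per_target`, and `sample` are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Episode:
    target: str            # navigation target this episode was collected for
    transitions: list      # list of (state, action) pairs
    episode_return: float  # assumed optimality criterion: higher is better


@dataclass
class OptimalReplayMemory:
    """Keeps at most `per_target` highest-return episodes for each target."""
    per_target: int = 1
    store: dict = field(default_factory=dict)

    def add(self, episode: Episode) -> None:
        # Insert the episode, then drop everything except the best ones,
        # so memory grows with the number of targets, not with experience.
        kept = self.store.setdefault(episode.target, [])
        kept.append(episode)
        kept.sort(key=lambda e: e.episode_return, reverse=True)
        del kept[self.per_target:]

    def sample(self, old_targets, batch_size, rng=random):
        # Pool the stored transitions of previously learned targets and
        # draw a batch without replacement for an off-policy update.
        pool = [
            (e.target, state, action)
            for t in old_targets
            for e in self.store.get(t, [])
            for (state, action) in e.transitions
        ]
        if not pool:
            return []
        return rng.sample(pool, min(batch_size, len(pool)))


memory = OptimalReplayMemory(per_target=1)
memory.add(Episode("chair", [("s0", "forward"), ("s1", "left")], episode_return=0.4))
memory.add(Episode("chair", [("s0", "left")], episode_return=0.9))
memory.add(Episode("lamp", [("s2", "right")], episode_return=0.7))

# While learning a new target, replay the best stored experience for "chair".
batch = memory.sample(old_targets=["chair"], batch_size=4)
```

Under these assumptions the memory holds one episode per target, which is in the spirit of the small memory size reported in the abstract; the actual selection and replay rules are defined in the paper itself.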