RTP-Q: a Reinforcement Learning System with an Active Exploration Planning Structure for Enhancing the Convergence Rate

Gang Zhao; Tatsumi, S.; Ruoying Sun
DOI: https://doi.org/10.1109/icsmc.1999.815597
1999-01-01
Abstract: In this paper, we propose an active exploration planning method for the prioritized sweeping reinforcement learning system, so that an agent explores its environment efficiently. Taking into account the characteristics of the value estimates of the primitive learning system in our structure, the method makes full use of the learned model, plans active exploration actions, and simplifies the setting of the parameters. The proposed system thus uses the learned model not only to compute the estimates but also to realize active exploration of the environment. Comparison experiments against different methods on navigation tasks demonstrate the efficiency of the proposed method.
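For readers unfamiliar with prioritized sweeping, the following is a minimal sketch of the generic planning step that such systems build on. It is not the RTP-Q method or its exploration planner, which the abstract does not specify. It assumes a deterministic learned model stored as a dictionary mapping (state, action) to (reward, next state); the names td_error and planning_sweep and the parameters gamma, theta, and max_backups are hypothetical and chosen for illustration only.

```python
import heapq
from collections import defaultdict


def td_error(Q, model, s, a, actions, gamma):
    """Absolute one-step error of Q[(s, a)] under the learned model (assumed deterministic)."""
    r, s_next = model[(s, a)]
    best_next = max(Q[(s_next, b)] for b in actions)
    return abs(r + gamma * best_next - Q[(s, a)])


def planning_sweep(Q, model, s, a, actions, gamma=0.9, theta=1e-6, max_backups=100):
    """Run prioritized backups starting from the pair (s, a); return the number of backups."""
    # Predecessor index built from the model: next state -> pairs that lead to it.
    preds = defaultdict(set)
    for (ps, pa), (_, ns) in model.items():
        preds[ns].add((ps, pa))

    queue = []  # min-heap on negated priority, so the largest error is popped first
    p = td_error(Q, model, s, a, actions, gamma)
    if p > theta:
        heapq.heappush(queue, (-p, s, a))

    backups = 0
    while queue and backups < max_backups:
        _, s, a = heapq.heappop(queue)
        r, s_next = model[(s, a)]
        # Full backup; valid here because the model is assumed deterministic.
        Q[(s, a)] = r + gamma * max(Q[(s_next, b)] for b in actions)
        backups += 1
        # Re-prioritize every pair predicted to lead into the state just updated.
        for ps, pa in preds[s]:
            p = td_error(Q, model, ps, pa, actions, gamma)
            if p > theta:
                heapq.heappush(queue, (-p, ps, pa))
    return backups
```

Here Q is expected to be a defaultdict(float), so unseen pairs and terminal states default to a value of zero. The sketch shows only how a learned model is reused to compute estimates; using the same model to plan exploration, as the abstract describes, would require details that are not given here.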