Model-based Exploration Strategy to Accelerate Deterministic Strategy Algorithm Training

Xiaotong Zhao, Jingli Du, Zhihan Wang
DOI: https://doi.org/10.1109/ictai59109.2023.00114
2023-01-01
Abstract: Exploitation and exploration in Reinforcement Learning (RL) are often in conflict, and a persistent problem in RL is finding effective methods for exploring the state space. For deterministic strategy algorithms, the exploration strategy directly affects how the agent searches the state space, which in turn affects training efficiency. Model-based RL algorithms consider the environment model to be completely black-box, but when RL is applied to real-world problems, part of the environment model is often known. In this paper, we use the known part of the environment model in our exploration strategy to achieve more efficient exploration and thus accelerate the training of the agent. This setting is also closer to practice, where part of the environment model can often be obtained from a priori knowledge or by learning the model. Our exploration strategy is applicable to environments with continuous action spaces. We propose three exploration methods that use partial or full environment models to achieve more efficient exploration. We validate them on two Gym continuous action space environments and a 7-joint robotic arm environment.
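The abstract does not spell out the three exploration methods, so the following is only a minimal sketch of the general idea, not the authors' algorithm: perturb the deterministic policy's action, use a known environment model to predict where each candidate action leads, and pick the candidate whose predicted next state is farthest from states already visited. Everything here is an assumption made for illustration: the 1-D point-mass dynamics in `known_model`, the nearest-neighbour `novelty` score, and the parameters `n_candidates` and `sigma`.

```python
import math
import random


def known_model(state, action):
    """Hypothetical known dynamics: 1-D point mass, state = (position, velocity)."""
    pos, vel = state
    dt = 0.1  # assumed integration step
    new_vel = vel + action * dt
    return (pos + new_vel * dt, new_vel)


def novelty(state, visited):
    """Distance from `state` to the nearest previously visited state."""
    if not visited:
        return float("inf")
    return min(math.dist(state, v) for v in visited)


def explore_action(policy_action, state, visited, rng,
                   n_candidates=16, sigma=0.3, low=-1.0, high=1.0):
    """Choose the candidate action whose predicted next state is most novel.

    Candidates are the policy's own action plus Gaussian perturbations of it,
    clipped to the action bounds [low, high].
    """
    candidates = [policy_action]
    for _ in range(n_candidates):
        perturbed = policy_action + rng.gauss(0.0, sigma)
        candidates.append(min(high, max(low, perturbed)))
    # Score each candidate with the known model instead of stepping the real environment.
    return max(candidates, key=lambda a: novelty(known_model(state, a), visited))


if __name__ == "__main__":
    rng = random.Random(0)
    visited = [(0.0, 0.0)]
    action = explore_action(0.0, (0.0, 0.0), visited, rng)
    print("exploratory action:", action)
```

In this sketch the model is queried only to rank candidate actions; the chosen action would still be executed in the real environment and the resulting transition stored for training as usual.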