Survey of Reinforcement Learning Based on Human Prior Knowledge

GUO Zijing, FENG Yanghe, YAO Chendie, XU Naifu
DOI: https://doi.org/10.1142/s1752890922300011
2022-01-01
Journal of Uncertain Systems
Abstract: At present, task planning is mainly solved with rule-based or operations research methods. The time and space complexity of these methods grows exponentially with problem size, so large-scale problems are challenging to solve, and the methods cannot handle dynamic task planning problems. Reinforcement learning (RL) is often used to solve dynamic planning problems that involve continuous decision making: an RL agent repeatedly interacts with the environment to achieve the expected goal. However, RL struggles when the state space is large. More sampling and exploration are needed to update the policy gradient, and convergence is slow. Humans get a quick start on learning by using prior knowledge, which reduces the time spent exploring a problem. We therefore review methods that combine human prior knowledge with RL, classifying them by the point in time at which the prior knowledge is incorporated into RL. With such methods, agents need less sampling and exploration of the environment and reach the optimal policy faster. Finally, we propose directions for future development.