Shared Autonomy Based on Human-in-the-loop Reinforcement Learning with Policy Constraints

Ming Li, Yu Kang, Yun-Bo Zhao, Jin Zhu, Shiyi You
DOI: https://doi.org/10.23919/ccc55666.2022.9902295
2022-01-01
Abstract: In shared autonomous systems, humans and agents cooperate to complete tasks. Because reinforcement learning (RL) lets agents obtain good policies through trial and error without knowing the dynamic model of the environment, it has been widely applied in shared autonomous systems. After inferring the target from human inputs, agents trained by RL can act on it accurately. However, existing methods of this kind have serious drawbacks: training RL algorithms requires extensive exploration, which is time-consuming, lacks safety guarantees, and is likely to cause great losses during training. Moreover, most shared control methods are human-oriented and do not consider that humans may take wrong actions. To address these problems, this paper proposes human-in-the-loop reinforcement learning with policy constraints. During training, human prior knowledge is used to constrain the agent's exploration, achieving fast and efficient learning. During testing, policy constraints are incorporated into the arbitration to avoid serious consequences caused by human mistakes.
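The abstract does not spell out how the constraints are implemented, so the following is only a minimal sketch of the two ideas under stated assumptions: a discrete action space, a tabular list of action values `q_values`, a human-supplied set `allowed_actions` standing in for the prior knowledge, and a value gap `tolerance` standing in for the policy constraint used in arbitration. The function names and parameters are hypothetical and are not taken from the paper.

```python
import random


def constrained_explore(q_values, allowed_actions, epsilon, rng=random):
    """Epsilon-greedy exploration restricted to human-approved actions.

    Assumption: the human prior is given as a non-empty list of action
    indices that the agent is permitted to try during training.
    """
    if not allowed_actions:
        raise ValueError("allowed_actions must not be empty")
    if rng.random() < epsilon:
        # Explore, but only inside the human-approved set.
        return rng.choice(allowed_actions)
    # Exploit: best-valued action among the approved ones.
    return max(allowed_actions, key=lambda a: q_values[a])


def arbitrate(q_values, human_action, tolerance):
    """Keep the human action unless it looks much worse than the best one.

    Assumption: the constraint is a simple value gap. If the human action's
    value is within `tolerance` of the best value, the human keeps control;
    otherwise the agent's greedy action is executed instead.
    """
    best_action = max(range(len(q_values)), key=lambda a: q_values[a])
    if q_values[human_action] >= q_values[best_action] - tolerance:
        return human_action
    return best_action
```

With `q_values = [1.0, 0.9, -5.0]` and `tolerance = 0.5`, a human choice of action 1 is kept because it is nearly as good as the best action, while a choice of action 2 is overridden by action 0. A value-gap rule is only one possible form of constraint; the paper may use a different criterion.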