Automatic hierarchical approach of MAXQ based on action space partition

Qi WANG, Jin QIN
DOI: https://doi.org/10.11772/j.issn.1001-9081.2017.05.1357
2017-01-01
Abstract: Since the hierarchy of a Markov Decision Process (MDP) needs to be constructed manually in hierarchical reinforcement learning, and some automatic hierarchical approaches based on the state space produce unsatisfactory results in environments without obvious subgoals, a new automatic hierarchical approach based on action space partition is proposed. Firstly, the set of actions is decomposed into disjoint subsets according to the state component associated with each action. Then, bottleneck actions are identified by analyzing the actions executable by the agent in different states. Finally, based on the execution order of actions and the bottleneck actions, the relationships among the action subsets are determined and a hierarchy is constructed. Furthermore, the termination condition for subtasks in the MAXQ method is modified so that the MAXQ method can find the optimal policy using the hierarchy produced by the proposed algorithm. The experimental results show that the algorithm automatically constructs a hierarchical structure that is not affected by changes in the environment. Compared with the Q-learning and Sarsa algorithms, the MAXQ method with the proposed hierarchy obtains the optimal policy faster and achieves higher returns. This verifies that the proposed algorithm can effectively construct the MAXQ hierarchy and make the search for the optimal policy more efficient.
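The first step of the abstract, splitting the action set into disjoint subsets by state component, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each action is annotated with the state components it modifies, and the `effects` mapping, the `partition_actions` helper, and the taxi-style action names are all hypothetical. Bottleneck detection and hierarchy construction are not covered.

```python
from collections import defaultdict


def partition_actions(effects):
    """Group actions by the set of state components they change.

    `effects` maps an action name to the state components it modifies.
    This encoding is an assumption made for illustration; the abstract
    does not specify how actions are linked to state components.
    """
    groups = defaultdict(list)
    for action, components in effects.items():
        # Each action is filed under exactly one key, so the resulting
        # subsets are disjoint and together cover the whole action set.
        groups[frozenset(components)].append(action)
    # Sort inside and across groups to get a deterministic result.
    return sorted(sorted(group) for group in groups.values())


# Hypothetical taxi-style domain: movement changes the position,
# pickup/putdown change the passenger status.
effects = {
    "north": {"position"},
    "south": {"position"},
    "east": {"position"},
    "west": {"position"},
    "pickup": {"passenger"},
    "putdown": {"passenger"},
}

subsets = partition_actions(effects)
print(subsets)
```

Under these assumptions the movement actions end up in one subset and the passenger actions in another, and these subsets are the units the later steps would arrange into a hierarchy.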