Option-based Multi-agent Exploration

Xuwei Song, Lipeng Wan, Zeyang Liu, Xingyu Chen, Xuguang Lan
DOI: https://doi.org/10.1109/cyber55403.2022.9907622
2022-01-01
Abstract: Effective exploration is essential to cooperative multi-agent reinforcement learning (MARL). However, existing MARL exploration algorithms still face two challenges: an enormous exploration space and partial observability constraints. To address these challenges, we propose a method called option-based multi-agent exploration (OMAE). We introduce the concept of options, defined as policies with a termination condition, to reduce the number of decisions. Option-based exploration improves learning efficiency because the option space is much smaller than the original policy space. To overcome partial observability constraints, where the global state is not available during execution, we use a dual-policy framework. The framework separates the exploration and exploitation policies so that the exploitation policy can benefit from state information without explicitly taking the options as input. We further introduce likelihood estimation to address the distribution shift between the two policies. Experimental results show that OMAE improves coordination over the baseline methods on most tasks in the StarCraft II environment (SMAC).
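To make the notion of an option concrete, the following is a minimal single-agent sketch of "a policy with a termination condition", showing why choosing among options takes fewer decisions than choosing primitive actions. It is not OMAE's implementation: the `Option` class, `run_option`, `explore`, and the toy corridor environment are all hypothetical names and assumptions introduced only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Option:
    """A temporally extended action: a policy plus a termination condition."""
    policy: Callable[[int], int]                  # observation -> primitive action
    should_terminate: Callable[[int, int], bool]  # (observation, steps taken) -> stop?


def step_env(obs: int, action: int) -> int:
    """Toy 1-D corridor (an assumption): positions 0..10, actions -1 or +1."""
    return max(0, min(10, obs + action))


def run_option(option: Option, obs: int, max_steps: int = 100) -> Tuple[int, List[int]]:
    """Follow one option until it terminates; return final obs and actions taken."""
    actions: List[int] = []
    for t in range(max_steps):
        action = option.policy(obs)
        obs = step_env(obs, action)
        actions.append(action)
        if option.should_terminate(obs, t + 1):
            break
    return obs, actions


# Two hand-written options: walk right or left for up to 5 steps or until a wall.
go_right = Option(policy=lambda o: 1, should_terminate=lambda o, t: o >= 10 or t >= 5)
go_left = Option(policy=lambda o: -1, should_terminate=lambda o, t: o <= 0 or t >= 5)


def explore(start: int, num_decisions: int, rng: random.Random) -> Tuple[int, int, int]:
    """Explore by picking options at random; count decisions vs. primitive steps."""
    obs, primitive_steps = start, 0
    for _ in range(num_decisions):
        option = rng.choice([go_right, go_left])
        obs, actions = run_option(option, obs)
        primitive_steps += len(actions)
    return obs, num_decisions, primitive_steps
```

In this sketch each option-level decision covers several primitive steps, which is the sense in which an option space can be smaller than the primitive policy space. How OMAE learns options, coordinates multiple agents, or handles partial observability is not shown here.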