Clustered Reinforcement Learning

Xiao Ma, Shen-Yi Zhao, Zhao-Heng Yin, Wu-Jun Li
DOI: https://doi.org/10.1007/s11704-024-3194-1
IF: 2.6688
2024-01-01
Frontiers of Computer Science
Abstract: Exploration strategy design is a challenging problem in reinforcement learning (RL), especially when the environment contains a large state space or sparse rewards. During exploration, the agent tries to discover unexplored (novel) areas or high-reward (quality) areas. Most existing methods perform exploration by utilizing only the novelty of states. The novelty and quality in the neighboring area of the current state have not been well utilized to simultaneously guide the agent’s exploration. To address this problem, this paper proposes a novel RL framework, called clustered reinforcement learning (CRL), for efficient exploration in RL. CRL adopts clustering to divide the collected states into several clusters, based on which a bonus reward reflecting both novelty and quality in the neighboring area (cluster) of the current state is given to the agent. CRL leverages these bonus rewards to guide the agent to perform efficient exploration. Moreover, CRL can be combined with existing exploration strategies to improve their performance, as the bonus rewards employed by these existing exploration strategies capture only the novelty of states. Experiments on four continuous control tasks and six hard-exploration Atari-2600 games show that our method outperforms other state-of-the-art methods.
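The clustering-based bonus described in the abstract can be illustrated with a minimal sketch. The abstract does not give the bonus formula, so everything below is an assumption made for illustration: the plain k-means routine, the function names (`kmeans`, `assign`, `cluster_bonus`), the weights `beta` and `eta`, and in particular the bonus itself, which here is taken to be `beta * (novelty + eta * quality)`, with novelty as one minus the cluster's share of visited states and quality as the cluster's mean reward. It is not the paper's definition.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(state, centers):
    """Index of the nearest cluster center (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda j: sq_dist(state, centers[j]))


def kmeans(states, k, iters=20, seed=0):
    """Plain k-means over collected states (illustrative, not the paper's code)."""
    rng = random.Random(seed)
    centers = [list(s) for s in rng.sample(states, k)]
    for _ in range(iters):
        labels = [assign(s, centers) for s in states]
        for j in range(k):
            members = [s for s, lab in zip(states, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def cluster_bonus(states, rewards, centers, beta=0.1, eta=1.0):
    """Per-cluster bonus combining novelty and quality.

    ASSUMED form, not taken from the paper:
      novelty = 1 - (states in cluster / all states)
      quality = mean reward observed in the cluster
      bonus   = beta * (novelty + eta * quality)
    """
    k = len(centers)
    counts = [0] * k
    reward_sums = [0.0] * k
    for s, r in zip(states, rewards):
        j = assign(s, centers)
        counts[j] += 1
        reward_sums[j] += r
    total = len(states)
    bonuses = []
    for j in range(k):
        novelty = 1.0 - counts[j] / total
        quality = reward_sums[j] / counts[j] if counts[j] else 0.0
        bonuses.append(beta * (novelty + eta * quality))
    return bonuses


if __name__ == "__main__":
    states = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
    rewards = [0.0, 0.0, 0.0, 1.0]
    centers = kmeans(states, k=2)
    print(cluster_bonus(states, rewards, centers))
```

Under these assumptions, a state's shaped reward would be its environment reward plus the bonus of the cluster it falls into, so rarely visited clusters and clusters with higher observed reward both attract the agent.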