Intrinsic Motivation Exploration Via Self-Supervised Prediction in Reinforcement Learning

Zhiyou Yang, Hongfei Du, Yunhan Wu, Zhuotong Jiang, Hong Qu
DOI: https://doi.org/10.1109/docs63458.2024.10704242
2024-01-01
Abstract: In many real-world scenarios, the extrinsic rewards available to an agent are exceedingly sparse. In such cases, curiosity can serve as an intrinsic reward signal that motivates the agent to explore its environment and acquire skills that may prove valuable later. Exploration based on information novelty has accordingly achieved considerable success on challenging reinforcement learning problems with sparse rewards. In this paper, we propose an exploration strategy, called Intrinsic Motivation Exploration (IME), that distills task-relevant information from raw state-action pairs using a self-supervised inverse dynamics model. The exploration bonus is quantified as the compressiveness of raw state-action pairs with respect to the learned representation of the forward model. Through extensive experiments on several continuous control benchmarks, we demonstrate that IME learns an effective exploration strategy by robustly measuring the novelty of raw state-action pairs.
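To make the idea of an intrinsic exploration bonus concrete, the following is a minimal sketch of a generic curiosity-style bonus, not the IME method itself. It assumes the common formulation in which the bonus is the forward model's prediction error in a learned feature space; the abstract's compressiveness measure is not specified here and is not reproduced. All names (`encode`, `intrinsic_bonus`, `total_reward`, `W_enc`, `W_fwd`, `beta`) are hypothetical, and the weights are random stand-ins for an encoder that would in practice be trained with an inverse dynamics objective.

```python
import numpy as np

def encode(state, W_enc):
    """Hypothetical feature encoder phi(s): one tanh layer.
    In practice this would be trained via an inverse dynamics loss."""
    return np.tanh(state @ W_enc)

def intrinsic_bonus(state, action, next_state, W_enc, W_fwd):
    """Assumed bonus: squared forward-model prediction error in feature space."""
    phi = encode(state, W_enc)
    phi_next = encode(next_state, W_enc)
    # Linear forward model predicts the next features from (phi(s), a).
    pred_next = np.concatenate([phi, action], axis=-1) @ W_fwd
    return 0.5 * np.sum((pred_next - phi_next) ** 2, axis=-1)

def total_reward(extrinsic, bonus, beta=0.1):
    """Combine sparse extrinsic reward with the scaled intrinsic bonus."""
    return extrinsic + beta * bonus

# Toy batch of transitions with random (untrained) weights.
rng = np.random.default_rng(0)
state_dim, action_dim, feat_dim, batch = 4, 2, 8, 5
W_enc = rng.normal(size=(state_dim, feat_dim))
W_fwd = rng.normal(size=(feat_dim + action_dim, feat_dim))
s = rng.normal(size=(batch, state_dim))
a = rng.normal(size=(batch, action_dim))
s_next = rng.normal(size=(batch, state_dim))

bonus = intrinsic_bonus(s, a, s_next, W_enc, W_fwd)
rewards = total_reward(np.zeros(batch), bonus)  # extrinsic reward is all zeros
```

With all-zero extrinsic rewards, as in a sparse-reward setting, the agent's learning signal in this sketch comes entirely from the bonus, which is larger for transitions the forward model predicts poorly.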