Enhancing Visual Reinforcement Learning with State-Action Representation

Mengbei Yan, Jiafei Lyu, Xiu Li
DOI: https://doi.org/10.1016/j.knosys.2024.112487
2024-01-01
Abstract: Despite the remarkable progress made in visual reinforcement learning (RL) in recent years, sample inefficiency remains a major challenge. Many existing approaches attempt to address this by extracting better representations from raw images, using techniques such as data augmentation or auxiliary tasks. However, these methods overlook the environmental dynamics information embedded in the collected transitions, which can be crucial for efficient control. In this paper, we present STAR: State-Action Representation Learning, a simple yet effective approach for visual continuous control. STAR learns a joint state-action representation by modeling the dynamics of the environment in the latent space. By incorporating the learned joint state-action representation into the critic, STAR enhances value estimation with latent dynamics information. We theoretically show that the value function can still converge to the optimum when given additional representation inputs. On various challenging visual continuous control tasks from the DeepMind Control Suite, STAR achieves significant improvements in sample efficiency compared to strong baseline algorithms.
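The abstract does not give architectural details, so the following is only a minimal sketch of the general idea it describes: a joint state-action representation trained to predict the next latent state, and a critic that takes that representation as an extra input. The dimensions, the single-layer linear maps, the squared-error dynamics loss, and all function names are assumptions made for illustration; they are not taken from the paper. The image encoder that would produce the latent state `z` is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
STATE_DIM, ACTION_DIM, JOINT_DIM = 8, 2, 6

def linear(in_dim, out_dim):
    """Random weight matrix standing in for a trained network layer."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim))

W_joint = linear(STATE_DIM + ACTION_DIM, JOINT_DIM)  # joint state-action encoder
W_dyn = linear(JOINT_DIM, STATE_DIM)                 # latent dynamics head
W_q = linear(STATE_DIM + ACTION_DIM + JOINT_DIM, 1)  # critic head

def joint_representation(z, a):
    """Map a latent state z and an action a to a joint representation."""
    return np.tanh(np.concatenate([z, a], axis=-1) @ W_joint)

def dynamics_loss(z, a, z_next):
    """Assumed auxiliary loss: squared error of the predicted next latent state."""
    pred = joint_representation(z, a) @ W_dyn
    return float(np.mean((pred - z_next) ** 2))

def q_value(z, a):
    """Critic that receives the joint representation as an additional input."""
    phi = joint_representation(z, a)
    return (np.concatenate([z, a, phi], axis=-1) @ W_q).squeeze(-1)

# Toy batch of transitions with random values in place of encoder outputs.
batch = 4
z = rng.normal(size=(batch, STATE_DIM))
a = rng.uniform(-1.0, 1.0, size=(batch, ACTION_DIM))
z_next = rng.normal(size=(batch, STATE_DIM))

loss = dynamics_loss(z, a, z_next)
q = q_value(z, a)
```

In this sketch the critic sees the latent state, the action, and the joint representation together, so any dynamics information captured by the auxiliary loss is available to value estimation. How the actual method combines and trains these components is not stated in the abstract.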