Sequential and Dynamic Constraint Contrastive Learning for Reinforcement Learning.

Weijie Shen, Lei Yuan, Junfu Huang, Songyi Gao, Yuyang Huang, Yang Yu
DOI: https://doi.org/10.1109/ijcnn52387.2021.9533984
2021-01-01
Abstract: Contrastive unsupervised learning shows remarkable promise for improving sample efficiency in reinforcement learning, especially for high-dimensional observations, by extracting latent features from raw inputs. However, prior works rarely take sequential information or knowledge of dynamic transitions into account when constructing contrastive samples. In this paper, we propose Sequential and Dynamic constraint Contrastive Reinforcement Learning (SDCRL) to improve sample efficiency in settings with high-dimensional inputs (e.g., images). We first construct a sequential contrastive module to extract latent features carrying sequential information from raw, correlated image inputs. We then add a dynamic transition classification module to extract knowledge of state transitions. We validate the proposed method in the low-sample regime (few interactions). Our algorithm surpasses prior pixel-based approaches on complex tasks in the DeepMind Control Suite and even matches or exceeds the performance of the method that uses state-based features as inputs on 11 out of 15 tasks. On Atari 2600 games, SDCRL also outperforms strong baselines and achieves state-of-the-art performance on 7 out of 26 games.
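To make the idea of contrasting sequence-level features concrete, the following is a minimal sketch of a generic InfoNCE-style contrastive loss applied to windows of consecutive frame features. It is not the SDCRL objective, whose details the abstract does not give. The function names `window_features` and `info_nce_loss`, the window length `k`, and the temperature value are all hypothetical choices made for illustration.

```python
import numpy as np


def window_features(latents, k):
    """Concatenate k consecutive per-frame latents into one feature per window.

    Hypothetical helper: `latents` has shape (T, d); the result has shape
    (T - k + 1, k * d). This is one simple way to inject sequential
    information, assumed here for illustration only.
    """
    T, _ = latents.shape
    return np.stack([latents[t:t + k].reshape(-1) for t in range(T - k + 1)])


def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss over a batch.

    Row i of `positives` is treated as the positive for row i of `anchors`;
    every other row in the batch acts as a negative.
    """
    # L2-normalise so the dot product is a cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                        # shape (B, B)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with the diagonal entries as the correct class.
    return float(-np.mean(np.diag(log_probs)))
```

In this sketch, two augmented views of the same window would supply `anchors` and `positives`; correctly paired views should yield a much lower loss than mismatched ones.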