Abstract: In this paper we present a world model that learns causal features using the invariance principle. In particular, we use contrastive unsupervised learning to learn the invariant causal features, which enforces invariance across augmentations of irrelevant parts or styles of the observation. World-model-based reinforcement learning methods optimize representation learning and the policy independently. Thus a naive contrastive loss implementation collapses due to a lack of supervisory signals to the representation learning module. We propose an intervention-invariant auxiliary task to mitigate this issue. Specifically, we utilize depth prediction to explicitly enforce the invariance and use data augmentation as style intervention on the RGB observation space. Our design leverages unsupervised representation learning to learn the world model with invariant causal features. Our proposed method significantly outperforms current state-of-the-art model-based and model-free reinforcement learning methods on out-of-distribution point navigation tasks on the iGibson dataset. Moreover, our proposed model excels at the sim-to-real transfer of our perception learning module. Finally, we evaluate our approach on the DeepMind control suite and enforce invariance only implicitly since depth is not available. Nevertheless, our proposed model performs on par with the state-of-the-art counterpart.

What problem does this paper attempt to address?

This paper addresses three main problems:

1. **Feature learning and policy optimization in model-based reinforcement learning**: Existing model-based reinforcement learning (MBRL) methods optimize feature learning and the control policy independently. The feature learning module can therefore fail for lack of sufficient supervisory signals, so how to supply such a signal is a key issue.
2. **Insufficient generalization ability**: When the environment changes or out-of-distribution (OoD) situations arise, the performance of existing methods often drops significantly. To meet this challenge, the authors propose a new world model that improves generalization through causal invariant features.
3. **Sim-to-real transfer**: Many models trained in simulated environments perform poorly in practice because they fail to adapt to the complexity and variation of the real world. The paper therefore explores how causal invariant features can improve the transfer performance of the perception learning module.

Specifically, the authors introduce a world model with invariant causal features (WMC) and use contrastive unsupervised learning to extract these features. They also design an intervention-invariant auxiliary task, depth prediction, so that the model learns geometric features independent of style, which improves its robustness and generalization ability. In this way, WMC not only outperforms current state-of-the-art model-based and model-free reinforcement learning methods on out-of-distribution point navigation tasks, but also performs well in sim-to-real transfer of perception learning.
Experimental results show that the proposed model has better OoD generalization ability and sim-to-real transfer performance on the iGibson dataset than other baseline models.
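
The combination described above, a contrastive loss over two style-augmented views plus a depth-prediction auxiliary term, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the InfoNCE form of the contrastive loss, the L1 depth loss, the temperature, the `depth_weight` parameter, and all function names are assumptions chosen for illustration, since the summary does not state the exact losses or weighting.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE loss between two batches of embeddings (assumed loss form).

    Row i of z_a and row i of z_b are embeddings of two augmented views of the
    same observation (the positive pair); every other row of z_b is a negative.
    """
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    logits = z_a @ z_b.T / temperature                   # (batch, batch) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))            # positives sit on the diagonal


def depth_l1(pred_depth, true_depth):
    """Mean absolute error of predicted depth (hypothetical auxiliary loss)."""
    return float(np.mean(np.abs(pred_depth - true_depth)))


def total_loss(z_a, z_b, pred_depth, true_depth, depth_weight=1.0):
    """Contrastive term plus weighted depth term; depth_weight is an assumption."""
    return info_nce(z_a, z_b) + depth_weight * depth_l1(pred_depth, true_depth)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(8, 16))                   # stand-in encoder outputs
    z_view = z + 0.01 * rng.normal(size=z.shape)   # second view, nearly identical
    depth = rng.uniform(0.5, 5.0, size=(8, 4, 4))  # stand-in depth maps
    print(total_loss(z, z_view, depth + 0.1, depth))
```

In this sketch the depth term supplies a supervised signal that does not change under style augmentation of the RGB input, which is the role the summary attributes to the auxiliary task; how the real model shares an encoder between the two terms is not specified here.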

Contrastive Unsupervised Learning of World Model with Invariant Causal Features

ReCoRe: Regularized Contrastive Representation Learning of World Model

Vision-and-Language Navigation via Causal Learning

Integrating Contrastive Learning with Dynamic Models for Reinforcement Learning from Images

Generalization properties of contrastive world models

What Should Not Be Contrastive in Contrastive Learning

Latent Causal Invariant Model

Invariant Feature Learning Based on Causal Inference from Heterogeneous Environments

Graph Contrastive Invariant Learning from the Causal Perspective

Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning

Causality Pursuit from Heterogeneous Environments via Neural Adversarial Invariance Learning

Causality-based Cross-Modal Representation Learning for Vision-and-Language Navigation

Regularizing Adversarial Imitation Learning Using Causal Invariance

CIPER: Combining Invariant and Equivariant Representations Using Contrastive and Predictive Learning

Contrastive Self-supervised Representation Learning Using Synthetic Data

Learning a World Model With Multitimescale Memory Augmentation

Learning Invariant Representations with a Nonparametric Nadaraya-Watson Head

Learning Invariant Representations for Reinforcement Learning without Reconstruction

Revealing Multimodal Contrastive Representation Learning through Latent Partial Causal Models

Contrastive Learning of Structured World Models

Learning 1D Causal Visual Representation with De-focus Attention Networks