Abstract: In real-world scenarios, the application of reinforcement learning is significantly challenged by complex non-stationarity. Most existing methods attempt to model changes in the environment explicitly, often requiring impractical prior knowledge of environments. In this paper, we propose a new perspective, positing that non-stationarity can propagate and accumulate through complex causal relationships during state transitions, thereby compounding its sophistication and affecting policy learning. We believe that this challenge can be more effectively addressed by implicitly tracing the causal origin of non-stationarity. To this end, we introduce the Causal-Origin REPresentation (COREP) algorithm. COREP primarily employs a guided updating mechanism to learn a stable graph representation for the state, termed the causal-origin representation. By leveraging this representation, the learned policy exhibits impressive resilience to non-stationarity. We supplement our approach with a theoretical analysis grounded in the causal interpretation for non-stationary reinforcement learning, advocating for the validity of the causal-origin representation. Experimental results further demonstrate the superior performance of COREP over existing methods in tackling non-stationarity problems.

### What problem does this paper attempt to solve?

This paper addresses the complex non-stationarity that reinforcement learning (RL) faces in real-world settings. Specifically:

1. **Challenges of non-stationarity**:
   - In practical applications, the dynamics of the environment change over time, which makes it difficult for traditional RL algorithms to adapt and maintain high performance.
   - Existing methods usually attempt to model environmental changes explicitly, but this often requires impractical prior knowledge of the environment.
2. **Propagation and accumulation through causal relationships**:
   - The paper posits that non-stationarity can propagate and accumulate through complex causal relationships during state transitions, thereby compounding its complexity and affecting policy learning.
   - This propagation makes it very difficult to deal with non-stationarity directly.
3. **Implicitly tracing the causal origin**:
   - The paper introduces a new perspective: the challenge can be addressed more effectively by implicitly tracing the causal origin of non-stationarity.
   - To this end, the authors propose the Causal-Origin REPresentation (COREP) algorithm, which primarily uses a guided updating mechanism to learn a stable graph representation for the state, termed the causal-origin representation.
4. **Theoretical support and experimental verification**:
   - The paper provides a theoretical analysis, grounded in a causal interpretation of non-stationary RL, that supports the validity of the causal-origin representation.
   - Experimental results further show that COREP outperforms existing methods on non-stationarity problems.

### Core contributions

- **Causal interpretation**: Provides a new causal-interpretation framework for understanding the role of non-stationarity in RL.
- **Modular algorithm design**: Designs a modular algorithm that can be easily integrated into existing RL algorithms.
- **Theoretical and empirical support**: Provides theoretical analysis and experimental results supporting the validity of the representation and the performance of the algorithm.

Through these contributions, the paper offers a new approach to the non-stationarity problem in RL.
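The claim that non-stationarity propagates and accumulates through causal relationships in state transitions can be illustrated with a toy example. The sketch below is not COREP and is not taken from the paper; it assumes a hypothetical three-variable linear transition model with causal edges `s0 -> s1 -> s2`, where only the mechanism generating `s0` changes over time. All names and coefficients are illustrative assumptions.

```python
# Toy illustration only: a hypothetical 3-variable linear transition model,
# not the model or algorithm used in the paper.
# Causal edges across time steps: s0 -> s1 -> s2.

def step(state, drift):
    """One state transition; `drift` perturbs only the s0 mechanism."""
    s0, s1, s2 = state
    next_s0 = 0.5 * s0 + drift  # the only non-stationary mechanism
    next_s1 = 0.5 * s1 + s0     # stationary mechanism, but it reads s0
    next_s2 = 0.5 * s2 + s1     # stationary mechanism, but it reads s1
    return (next_s0, next_s1, next_s2)

def rollout(drift_fn, steps):
    """Roll the model forward from the zero state, recording every state."""
    state = (0.0, 0.0, 0.0)
    trajectory = [state]
    for t in range(steps):
        state = step(state, drift_fn(t))
        trajectory.append(state)
    return trajectory

# Baseline with no change, and a run where the s0 mechanism shifts at t = 2.
stationary = rollout(lambda t: 0.0, 6)
shifted = rollout(lambda t: 1.0 if t >= 2 else 0.0, 6)

def first_affected(index):
    """First trajectory position where variable `index` leaves the baseline."""
    for t, (base, changed) in enumerate(zip(stationary, shifted)):
        if abs(base[index] - changed[index]) > 1e-12:
            return t
    return None

origin_lags = [first_affected(i) for i in range(3)]
print(origin_lags)   # position at which each variable first deviates
print(shifted[-1])   # deviation of each variable after six transitions
```

Although only one mechanism changes, every downstream variable eventually deviates from the baseline, each one step later than its causal parent, and in this toy setup the downstream deviations end up larger than the deviation at the origin. From the observed states alone, the change looks as if it affects all variables, which is the difficulty that tracing the causal origin is meant to address.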

Tackling Non-Stationarity in Reinforcement Learning via Causal-Origin Representation

Spatial-Temporal Causality Modeling for Industrial Processes with a Knowledge-Data Guided Reinforcement Learning

CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning

Learning by Doing: An Online Causal Reinforcement Learning Framework with Causal-Aware Policy

Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations

CIER: A Novel Experience Replay Approach with Causal Inference in Deep Reinforcement Learning

Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning

Learning Causal State Representations of Partially Observable Environments

Causal Representation Learning from Multiple Distributions: A General Setting

Causal Reinforcement Learning using Observational and Interventional Data

Causal Coordinated Concurrent Reinforcement Learning

Explainable Reinforcement Learning via a Causal World Model

Offline Reinforcement Learning with Causal Structured World Models

Causal Reasoning from Meta-reinforcement Learning

Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning

Language-Based Causal Representation Learning

Causal Reinforcement Learning: A Survey

Bridging State and History Representations: Understanding Self-Predictive RL

Online Reinforcement Learning in Non-Stationary Context-Driven Environments

Explainable Reinforcement Learning Through a Causal Lens

Interpretable Imitation Learning with Dynamic Causal Relations