Abstract: Reinforcement learning (RL) models usually assume that an agent's internal model structure is stationary, consisting of fixed learning rules and environment representations. However, this assumption cannot account for real problem solving by individuals, who can exhibit irrational behaviors or hold inaccurate beliefs about their environment. In this work, we present a novel framework called Dynamic Structure Learning (DSL), which allows agents to adapt their learning rules and internal representations dynamically. This structural flexibility enables a deeper understanding of how individuals learn and adapt in real-world scenarios. From observed behaviors, the DSL framework reconstructs the most likely sequence of agent structures, each drawn from a pool of learning rules and environment models. The method provides insights into how an agent's internal structure evolves as it transitions between different structures throughout the learning process. We applied our framework to study rat behavior in a maze task. Our results demonstrate that rats progressively refine their mental map of the maze, evolving from a suboptimal representation associated with repetitive errors to an optimal one that guides efficient navigation. Concurrently, their learning rules transition from heuristic-based to more rational approaches. These findings underscore the importance of both credit assignment and representation learning in complex behaviors. By going beyond simple reward-based associations, our research offers valuable insights into the cognitive mechanisms underlying decision-making in natural intelligence. The DSL framework allows better understanding and modeling of how individuals in real-world scenarios exhibit a level of adaptability that current AI systems have yet to achieve.
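
The abstract does not say how the most likely sequence of agent structures is computed. As an illustration only, the idea of recovering such a sequence from observed behavior can be sketched as Viterbi decoding over a pool of candidate structures. Everything in this sketch is an assumption, not the authors' method: the function name `most_likely_structure_sequence`, the input matrix `loglik` (per-trial log-likelihood of the observed behavior under each structure), the uniform prior over the initial structure, and the "sticky" transition model controlled by `stay_prob`.

```python
import numpy as np

def most_likely_structure_sequence(loglik, stay_prob=0.9):
    """Viterbi decoding over a pool of K candidate structures (illustrative).

    loglik[t, k]: log-likelihood of the behavior observed on trial t under
    structure k (hypothetical input; how it is obtained is not specified here).
    stay_prob: assumed probability of keeping the same structure between
    consecutive trials, with 0 < stay_prob < 1.
    """
    loglik = np.asarray(loglik, dtype=float)
    T, K = loglik.shape
    if K == 1:
        return [0] * T

    # Assumed transition model: stay with stay_prob, else switch uniformly.
    switch_prob = (1.0 - stay_prob) / (K - 1)
    log_trans = np.full((K, K), np.log(switch_prob))
    np.fill_diagonal(log_trans, np.log(stay_prob))

    # Uniform prior over the structure used on the first trial.
    score = np.log(1.0 / K) + loglik[0]
    backptr = np.zeros((T, K), dtype=int)

    for t in range(1, T):
        # cand[i, j]: score of the best path ending in i, followed by i -> j.
        cand = score[:, None] + log_trans
        backptr[t] = np.argmax(cand, axis=0)
        score = cand[backptr[t], np.arange(K)] + loglik[t]

    # Trace the best path backwards from the best final structure.
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]
```

With two candidate structures, behavior that is better explained by structure 0 on early trials and by structure 1 on later trials decodes to a single switch, while an isolated trial that weakly favors the other structure is absorbed rather than treated as a transition. How often switches are inferred depends directly on the assumed `stay_prob`.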

Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future

Learning Dynamics Models for Model Predictive Agents

Learning Parsimonious Dynamics for Generalization in Reinforcement Learning

Learning a World Model With Multitimescale Memory Augmentation

Investigating Compounding Prediction Errors in Learned Dynamics Models

Predicting Future Actions of Reinforcement Learning Agents

An End-to-end Deep Reinforcement Learning Approach for the Long-term Short-term Planning on the Frenet Space

Predictive Control Using Learned State Space Models via Rolling Horizon Evolution

Future Prediction Can be a Strong Evidence of Good History Representation in Partially Observable Environments

Deep Model-Based Reinforcement Learning for High-Dimensional Problems, a Survey

Continual Learning Using World Models for Pseudo-Rehearsal

Model-Based Reinforcement Learning via Latent-Space Collocation

Learning to Plan Long-Term for Language Modeling

Improving Long-Horizon Imitation Through Instruction Prediction

Safe Reinforcement Learning by Imagining the Near Future

Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning

Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity

Inferring Time-Varying Internal Models of Agents Through Dynamic Structure Learning

Physics-Informed Model and Hybrid Planning for Efficient Dyna-Style Reinforcement Learning

Model-Based Reinforcement Learning Via Imagination with Derived Memory

Learning World Models With Hierarchical Temporal Abstractions: A Probabilistic Perspective