Abstract: One of the drawbacks of traditional reinforcement learning (RL) algorithms has been their poor sample efficiency. One approach to improving sample efficiency is model-based RL. In our model-based RL algorithm, we learn a model of the environment (essentially its transition dynamics and reward function), use it to generate imaginary trajectories, and backpropagate through them to update the policy, exploiting the differentiability of the model. Intuitively, learning more accurate models should lead to better performance. We focus on robotic systems undergoing rigid body motion without contacts. Recently, there has been growing interest in developing better deep-neural-network-based dynamics models for physical systems through better inductive biases. We compare two versions of our model-based RL algorithm: one uses a standard deep-neural-network-based dynamics model, and the other uses a much more accurate, physics-informed neural-network-based dynamics model. We show that, in model-based RL, model accuracy mainly matters in environments that are sensitive to initial conditions. In these environments, the physics-informed version of our algorithm achieves significantly better average return and sample efficiency. In environments that are not sensitive to initial conditions, both versions of our algorithm achieve similar average return, while the physics-informed version achieves better sample efficiency. We measure the sensitivity to initial conditions using the finite-time maximal Lyapunov exponent. We also show that, in challenging environments where many samples are needed to learn, physics-informed model-based RL can achieve better average return than state-of-the-art model-free RL algorithms such as Soft Actor-Critic, by generating accurate imaginary data.
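The abstract does not say how the finite-time maximal Lyapunov exponent is estimated. As a rough illustration only, the sketch below uses the common two-trajectory estimator with renormalisation: a reference trajectory and a slightly perturbed copy are advanced together, and the average logarithmic growth rate of their separation over the horizon is returned. The function name `finite_time_lyapunov`, its parameters, and the toy systems `grow` and `decay` are hypothetical and are not taken from the paper.

```python
import math


def finite_time_lyapunov(step, x0, horizon, dt, eps=1e-8):
    """Estimate the finite-time maximal Lyapunov exponent from state x0.

    step(x, dt) must return the next state as a list of floats.
    Assumption: the initial perturbation is applied to the first state
    coordinate only, which is enough for this illustration.
    """
    x = list(x0)
    y = list(x0)
    y[0] += eps  # perturbed copy of the initial state
    log_growth = 0.0
    for _ in range(horizon):
        x = step(x, dt)
        y = step(y, dt)
        diff = [b - a for a, b in zip(x, y)]
        dist = math.sqrt(sum(d * d for d in diff))
        # Accumulate the log of the separation growth over this step.
        log_growth += math.log(dist / eps)
        # Rescale the perturbed state back to distance eps from x.
        y = [a + eps * d / dist for a, d in zip(x, diff)]
    return log_growth / (horizon * dt)


# Toy one-dimensional linear systems dx/dt = a * x, integrated exactly.
def grow(x, dt):
    return [x[0] * math.exp(1.0 * dt)]   # a = 1.0, nearby states diverge


def decay(x, dt):
    return [x[0] * math.exp(-0.5 * dt)]  # a = -0.5, nearby states converge


lam_grow = finite_time_lyapunov(grow, [1.0], horizon=100, dt=0.01)
lam_decay = finite_time_lyapunov(decay, [1.0], horizon=100, dt=0.01)
```

For these linear systems the estimate should recover the rate `a` up to floating-point error. A positive value indicates that nearby initial conditions separate exponentially over the horizon, which is the regime in which the abstract reports that model accuracy matters most.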

Model-based inverse reinforcement learning for deterministic systems

Convergence Analysis of an Incremental Approach to Online Inverse Reinforcement Learning

Inverse Reinforcement Learning with Unknown Reward Model based on Structural Risk Minimization

Online inverse reinforcement learning with unknown disturbances

Look Before You Leap: Safe Model-Based Reinforcement Learning with Human Intervention

Online Observer-Based Inverse Reinforcement Learning

Model-based reinforcement learning for infinite-horizon approximate optimal tracking

A Bayesian Approach to Robust Inverse Reinforcement Learning

Differentiable Physics Models for Real-world Offline Model-based Reinforcement Learning

Safe Model-Based Reinforcement Learning for Systems with Parametric Uncertainties

Offline Model-Based Reinforcement Learning with Anti-Exploration

Physics-Informed Model-Based Reinforcement Learning

A Framework and Method for Online Inverse Reinforcement Learning

Model-Based Inverse Reinforcement Learning from Visual Demonstrations

Bayesian Inverse Reinforcement Learning for Non-Markovian Rewards

Human-in-the-Loop Behavior Modeling via an Integral Concurrent Adaptive Inverse Reinforcement Learning

Physics-informed Dyna-style model-based deep reinforcement learning for dynamic control

Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior Noise

A survey on model-based reinforcement learning

An Analysis of Model-Based Reinforcement Learning From Abstracted Observations

Inverse Reinforcement Learning from Non-Stationary Learning Agents