Abstract: Learning to control unknown nonlinear dynamical systems is a fundamental problem in reinforcement learning and control theory. A commonly applied approach is to first explore the environment (exploration), learn an accurate model of it (system identification), and then compute a controller with minimum cost on this estimated system (policy optimization). While existing work has shown that it is possible to learn a uniformly good model of the system~\citep{mania2020active}, in practice, if we aim to learn a good controller with a low cost on the actual system, certain system parameters may be significantly more critical than others, and we therefore ought to focus our exploration on learning such parameters. In this work, we consider the setting of nonlinear dynamical systems and seek to formally quantify, in such settings, (a) which parameters are most relevant to learning a good controller, and (b) how we can best explore so as to minimize uncertainty in such parameters. Inspired by recent work in linear systems~\citep{wagenmaker2021task}, we show that minimizing the controller loss in nonlinear systems translates to estimating the system parameters in a particular, task-dependent metric. Motivated by this, we develop an algorithm able to efficiently explore the system to reduce uncertainty in this metric, and prove a lower bound showing that our approach learns a controller at a near-instance-optimal rate. Our algorithm relies on a general reduction from policy optimization to optimal experiment design in arbitrary systems, which may be of independent interest. We conclude with experiments demonstrating the effectiveness of our method in realistic nonlinear robotic systems.

Dynamic Programming-based Approximate Optimal Control for Model-Based Reinforcement Learning

Model-Based Robot Learning Control with Uncertainty Directed Exploration

A Learning-Based Optimal Tracking Controller for Continuous Linear Systems with Unknown Dynamics: Theory and Case Study

Model-Free Incremental Adaptive Dynamic Programming Based Approximate Robust Optimal Regulation

An Approximate Dynamic Programming Approach for Dual Stochastic Model Predictive Control

Actively Learning Reinforcement Learning: A Stochastic Optimal Control Approach

Data-based reinforcement learning approximate optimal control for an uncertain nonlinear system with control effectiveness faults

Stochastic Optimal Control as Approximate Input Inference

Deterministic Trajectory Optimization through Probabilistic Optimal Control

Optimal Exploration for Model-Based RL in Nonlinear Systems

Stochastic optimal control and estimation methods adapted to the noise characteristics of the sensorimotor system

Model-based reinforcement learning for infinite-horizon approximate optimal tracking

A Multilevel Approach for Stochastic Nonlinear Optimal Control

Average Cost Optimal Control of Stochastic Systems Using Reinforcement Learning

Optimal State Estimation Using Model-Free Reinforcement Learning

Estimation and Control Using Sampling-Based Bayesian Reinforcement Learning

Dynamic Event-Triggered Prescribed Performance Control for Partially Unknown Nonlinear System via Adaptive Dynamic Programming

Learning-Based Optimal Control with Performance Guarantees for Unknown Systems with Latent States

Model-Based Reinforcement Learning via Stochastic Hybrid Models

A Novel Approximate Dynamic Programming Structure for Optimal Control of Discrete-Time Time-Varying Nonlinear Systems

Approximate optimal and safe coordination of nonlinear second-order multirobot systems with model uncertainties