Abstract: Learning to control unknown nonlinear dynamical systems is a fundamental problem in reinforcement learning and control theory. A commonly applied approach is to first explore the environment (exploration), learn an accurate model of it (system identification), and then compute an optimal controller with minimum cost on this estimated system (policy optimization). While existing work has shown that it is possible to learn a uniformly good model of the system~\citep{mania2020active}, in practice, if we aim to learn a good controller with low cost on the actual system, certain system parameters may be significantly more critical than others, and we therefore ought to focus our exploration on learning those parameters. In this work, we consider the setting of nonlinear dynamical systems and seek to formally quantify, in such settings, (a) which parameters are most relevant to learning a good controller, and (b) how we can best explore so as to minimize uncertainty in those parameters. Inspired by recent work in linear systems~\citep{wagenmaker2021task}, we show that minimizing the controller loss in nonlinear systems translates to estimating the system parameters in a particular, task-dependent metric. Motivated by this, we develop an algorithm that efficiently explores the system to reduce uncertainty in this metric, and prove a lower bound showing that our approach learns a controller at a near-instance-optimal rate. Our algorithm relies on a general reduction from policy optimization to optimal experiment design in arbitrary systems, which may be of independent interest. We conclude with experiments demonstrating the effectiveness of our method on realistic nonlinear robotic systems.

Suboptimal Reduced Control of Unknown Nonlinear Singularly Perturbed Systems Via Reinforcement Learning

Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning

Reinforcement Learning-Based Control for Nonlinear Discrete-Time Systems with Unknown Control Directions and Control Constraints

Reduced-dimensional reinforcement learning control using singular perturbation approximations

Reinforcement Learning-Based Control for a Class of Nonlinear Systems with Unknown Control Directions

Online Off-Policy Reinforcement Learning for Optimal Control of Unknown Nonlinear Systems Using Neural Networks

Online reinforcement learning control of unknown nonaffine nonlinear discrete time systems

Suboptimal control for nonlinear slow‐fast coupled systems using reinforcement learning and Takagi–Sugeno fuzzy methods

Reinforcement Learning Controller Design for Affine Nonlinear Discrete-Time Systems Using Online Approximators

Adaptive Optimal Output Regulation of Interconnected Singularly Perturbed Systems with Application to Power Systems

Robust Safe Reinforcement Learning Control of Unknown Continuous-Time Nonlinear Systems with State Constraints and Disturbances

Optimal control for continuous-time Markov jump singularly perturbed systems: A hybrid reinforcement learning scheme

Reinforcement learning‐based composite suboptimal control for Markov jump singularly perturbed systems with unknown dynamics

Online Reinforcement Learning-based Neural Network Controller Design for Affine Nonlinear Discrete-time Systems

Control of Nonaffine Nonlinear Discrete-Time Systems Using Reinforcement-Learning-Based Linearly Parameterized Neural Networks

Hierarchical Sliding-Mode Surface-Based Adaptive Actor–Critic Optimal Control for Switched Nonlinear Systems With Unknown Perturbation

Stochastic Reinforcement Learning with Stability Guarantees for Control of Unknown Nonlinear Systems

Reinforcement Learning for Adaptive Optimal Stationary Control of Linear Stochastic Systems

Data-Driven Near-Optimal Control of Nonlinear Systems Over Finite Horizon

Optimal Exploration for Model-Based RL in Nonlinear Systems

Synergetic Learning Neuro-Control for Unknown Affine Nonlinear Systems With Asymptotic Stability Guarantees