Abstract: Nonlinear optimal control problems are often solved with numerical methods that require knowledge of the system's dynamics, which may be difficult to infer, and that carry a large computational cost associated with iterative calculations. We present a novel neurobiologically inspired hierarchical learning framework, Reinforcement Learning Optimal Control, which operates on two levels of abstraction and utilises a reduced number of controllers to control nonlinear systems with unknown dynamics in continuous state and action spaces. Our approach is inspired by research at two levels of abstraction. First, at the level of limb coordination, human behaviour is explained by linear optimal feedback control theory. Second, in cognitive tasks involving symbolic-level action selection, humans learn using model-free and model-based reinforcement learning algorithms. We propose that combining these two levels of abstraction leads to a fast global solution of nonlinear control problems using a reduced number of controllers. Our framework learns the local task dynamics from naive experience and forms locally optimal infinite-horizon Linear Quadratic Regulators, which produce continuous low-level control. A top-level reinforcement learner uses the controllers as actions and learns how best to combine them in state space while maximising a long-term reward. A single optimal control objective function drives high-level symbolic learning by providing training signals on the desirability of each selected controller. We show that a small number of locally optimal linear controllers can solve global nonlinear control problems with unknown dynamics when combined with a reinforcement learner in this hierarchical framework. Our algorithm competes in terms of computational cost and solution quality with sophisticated control algorithms, and we illustrate this with solutions to benchmark problems.
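The two-level structure described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes the local linear models are already known, whereas the framework learns them from experience. It also uses a hypothetical pendulum-like system, and the names `dlqr`, `local_model`, and `q_update` are invented for this example. The high level is written as cost minimisation, which is equivalent to maximising the negated cost as a reward.

```python
import numpy as np

def dlqr(A, B, Q, R, iters=2000):
    """Infinite-horizon discrete-time LQR gain via Riccati value iteration."""
    P = Q.copy()
    for _ in range(iters):
        # Gain for the current cost-to-go matrix P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati recursion: P = Q + A'PA - A'PB (R + B'PB)^-1 B'PA.
        P = Q + A.T @ P @ (A - B @ K)
    return K

def local_model(theta0, dt=0.05):
    """Assumed Euler-discretised linearisation of theta'' = sin(theta) + u
    around the angle theta0 (hypothetical system, for illustration only)."""
    A = np.array([[1.0, dt], [dt * np.cos(theta0), 1.0]])
    B = np.array([[0.0], [dt]])
    return A, B

def q_update(q_table, s, a, cost, s_next, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step. Actions index low-level controllers;
    cost is the quadratic control cost accrued while controller a was active."""
    target = cost + gamma * q_table[s_next].min()
    q_table[s, a] += alpha * (target - q_table[s, a])
    return q_table

# Low level: one LQR gain per linearisation point (two controllers here).
Q = np.eye(2)
R = np.array([[1.0]])
gains = [dlqr(*local_model(th), Q, R) for th in (0.0, np.pi)]

# High level: a Q-table over (symbolic state, controller index).
q_table = np.zeros((4, len(gains)))
```

In use, the high-level learner would pick a controller index `a` in symbolic state `s`, apply `u = -gains[a] @ x` for a number of low-level steps, sum the quadratic cost over those steps, and pass that sum to `q_update`.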

Discrete-Time Nonlinear Optimal Control Using Multi-Step Reinforcement Learning

Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning

Neural Network Based Multi-Step Predictive Control for Nonlinear Systems

Reinforcement Learning-Based Control for Nonlinear Discrete-Time Systems with Unknown Control Directions and Control Constraints

Reinforcement Learning-Based Control for a Class of Nonlinear Systems with Unknown Control Directions

Optimal Tracking Control of Nonlinear Multiagent Systems Using Internal Reinforce Q-Learning

Control of Nonaffine Nonlinear Discrete-Time Systems Using Reinforcement-Learning-Based Linearly Parameterized Neural Networks

Online Off-Policy Reinforcement Learning for Optimal Control of Unknown Nonlinear Systems Using Neural Networks

Model-free Adaptive Dynamic Programming for Optimal Control of Discrete-time Affine Nonlinear System

RLOC: Neurobiologically Inspired Hierarchical Reinforcement Learning Algorithm for Continuous Control of Nonlinear Dynamical Systems

Online Reinforcement Learning-based Neural Network Controller Design for Affine Nonlinear Discrete-time Systems

Data-Driven Near-Optimal Control of Nonlinear Systems Over Finite Horizon

Reinforcement Learning for a Discrete-Time Linear-Quadratic Control Problem with an Application

Robust Safe Reinforcement Learning Control of Unknown Continuous-Time Nonlinear Systems with State Constraints and Disturbances

Learning-based adaptive optimal control of linear time-delay systems: A value iteration approach

Policy Iteration Reinforcement Learning Method for Continuous-Time Linear-Quadratic Mean-Field Control Problems

Model-Based Safe Reinforcement Learning With Time-Varying Constraints: Applications to Intelligent Vehicles

Mixed Reinforcement Learning for Efficient Policy Optimization in Stochastic Environments

Physics‐informed reinforcement learning for optimal control of nonlinear systems

Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles

Reinforcement Learning Based on Real-Time Iteration NMPC