Abstract: Nonlinear optimal control problems are often solved with numerical methods that require knowledge of the system's dynamics, which may be difficult to infer, and that carry a large computational cost associated with iterative calculations. We present a novel neurobiologically inspired hierarchical learning framework, Reinforcement Learning Optimal Control, which operates on two levels of abstraction and utilises a reduced number of controllers to solve nonlinear systems with unknown dynamics in continuous state and action spaces. Our approach is inspired by research at two levels of abstraction. First, at the level of limb coordination, human behaviour is explained by linear optimal feedback control theory. Second, in cognitive tasks that involve learning symbolic-level action selection, humans learn such problems using model-free and model-based reinforcement learning algorithms. We propose that combining these two levels of abstraction leads to a fast global solution of nonlinear control problems using a reduced number of controllers. Our framework learns the local task dynamics from naive experience and forms locally optimal infinite-horizon Linear Quadratic Regulators, which produce continuous low-level control. A top-level reinforcement learner uses the controllers as actions and learns how best to combine them in state space while maximising a long-term reward. A single optimal control objective function drives high-level symbolic learning by providing training signals on the desirability of each selected controller. We show that a small number of locally optimal linear controllers are able to solve global nonlinear control problems with unknown dynamics when combined with a reinforcement learner in this hierarchical framework. Our algorithm competes in terms of computational cost and solution quality with sophisticated control algorithms, and we illustrate this with solutions to benchmark problems.
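The two-level scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The toy pendulum plant, the linearisation points, the cost weights, the torque limit, the angle discretisation and every function name below are assumptions made for illustration, and plain tabular Q-learning stands in for the high-level reinforcement learner. Local linear models are fitted by least squares from sampled transitions, each model yields an infinite-horizon LQR gain via Riccati iteration, and the high-level learner treats the choice of gain as its action, with the negative accumulated quadratic cost as its reward.

```python
import numpy as np

DT, U_MAX = 0.02, 5.0                            # step size, torque limit (assumed)
Q_COST, R_COST = np.eye(2), np.array([[1.0]])    # quadratic cost weights (assumed)

def wrap(theta):
    """Map an angle to [-pi, pi)."""
    return (theta + np.pi) % (2 * np.pi) - np.pi

def pendulum_step(x, u):
    """Toy nonlinear plant (assumed): damped pendulum, angle 0 is the upright target."""
    theta, omega = x
    domega = 9.81 * np.sin(theta) - 0.1 * omega + u
    return np.array([theta + DT * omega, omega + DT * domega])

def fit_local_model(x0, rng, n=200, scale=0.05):
    """Fit x' - x0 ~ A (x - x0) + B u + c by least squares on samples near x0."""
    X = x0 + scale * rng.standard_normal((n, 2))
    U = scale * rng.standard_normal((n, 1))
    Y = np.array([pendulum_step(x, u[0]) for x, u in zip(X, U)])
    Phi = np.hstack([X - x0, U, np.ones((n, 1))])
    W, *_ = np.linalg.lstsq(Phi, Y - x0, rcond=None)
    return W[:2].T, W[2:3].T                     # A is 2x2, B is 2x1

def dlqr(A, B, Q, R, iters=1000):
    """Infinite-horizon discrete-time LQR gain via fixed-point Riccati iteration."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

def train(controllers, rng, episodes=100, bins=12, hold=10, horizon=20,
          alpha=0.2, gamma=0.95, eps=0.2):
    """Tabular Q-learning over angle bins; each action selects one LQR gain."""
    qtab = np.zeros((bins, len(controllers)))
    to_bin = lambda x: int((x[0] + np.pi) / (2 * np.pi) * bins) % bins
    for _ in range(episodes):
        x = np.array([rng.uniform(-np.pi, np.pi), 0.0])
        for _ in range(horizon):
            s = to_bin(x)
            explore = rng.random() < eps
            a = int(rng.integers(len(controllers))) if explore else int(np.argmax(qtab[s]))
            cost = 0.0
            for _ in range(hold):                # low level: run the chosen LQR
                u = float(np.clip(-(controllers[a] @ x)[0], -U_MAX, U_MAX))
                cost += (x @ Q_COST @ x + R_COST[0, 0] * u ** 2) * DT
                x = pendulum_step(x, u)
                x[0] = wrap(x[0])
            # high level: reward is the negative accumulated control cost
            target = -cost + gamma * qtab[to_bin(x)].max()
            qtab[s, a] += alpha * (target - qtab[s, a])
    return qtab

rng = np.random.default_rng(0)
centres = [np.array([t, 0.0]) for t in (0.0, np.pi / 2, -np.pi / 2)]
controllers = [dlqr(*fit_local_model(c, rng), Q_COST, R_COST) for c in centres]
Qtab = train(controllers, rng)
policy = Qtab.argmax(axis=1)   # controller index preferred in each angle bin
```

Here `policy` maps each angle bin to the index of the locally fitted controller with the highest learned value. The same quadratic cost both defines the LQR gains and supplies the high-level reward, mirroring the single objective function mentioned in the abstract.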

RLOC: Neurobiologically Inspired Hierarchical Reinforcement Learning Algorithm for Continuous Control of Nonlinear Dynamical Systems
