SIAM Journal on Scientific Computing, Volume 46, Issue 5, Page C535-C556, October 2024.
Abstract: We present a neural network approach for approximating the value function of high-dimensional stochastic control problems. Our training process simultaneously updates our value function estimate and identifies the part of the state space likely to be visited by optimal trajectories. Our approach leverages insights from optimal control theory and the fundamental relation between semilinear parabolic partial differential equations and forward-backward stochastic differential equations. To focus the sampling on relevant states during neural network training, we use the stochastic Pontryagin maximum principle (PMP) to obtain the optimal controls for the current value function estimate. By design, our approach coincides with the method of characteristics for the nonviscous Hamilton–Jacobi–Bellman (HJB) equation arising in deterministic control problems. Our training loss consists of a weighted sum of the objective functional of the control problem and penalty terms that enforce the HJB equations along the sampled trajectories. Importantly, training is unsupervised in that it does not require solutions of the control problem. Our numerical experiments highlight our scheme's ability to identify the relevant parts of the state space and produce meaningful value estimates. Using a two-dimensional model problem, we demonstrate the importance of the stochastic PMP to inform the sampling and compare it to a finite element approach. With a nonlinear control-affine quadcopter example, we illustrate that our approach can handle complicated dynamics. For a 100-dimensional benchmark problem, we demonstrate that our approach improves accuracy and time-to-solution, and, via a modification, we show the wider applicability of our scheme.
Reproducibility of computational results. This paper has been awarded the "SIAM Reproducibility Badge: Code and data available" as recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://github.com/EmoryMLIP/NeuralSOC and in the supplementary material (NeuralSOC-main.zip [29.9MB]).
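
The ingredients named in the abstract can be written out in generic notation. What follows is a sketch under assumed notation, not the paper's own formulation: the symbols $z_s$, $u_s$, $f$, $\sigma$, $L$, $G$, $\Phi_\theta$, the residual $R_\theta$, and the penalty weights $\beta_1, \beta_2$ are all introduced here for illustration, the diffusion is assumed not to depend on the control, and the exact penalty terms used in the paper may differ.

```latex
% Controlled dynamics and value function (assumed notation).
\begin{align*}
  \mathrm{d}z_s &= f(s, z_s, u_s)\,\mathrm{d}s + \sigma(s, z_s)\,\mathrm{d}W_s,
      \qquad z_t = x, \\
  \Phi(t, x) &= \inf_{u}\; \mathbb{E}\!\left[ \int_t^T L(s, z_s, u_s)\,\mathrm{d}s + G(z_T) \right].
\end{align*}

% HJB equation satisfied by the value function, with terminal condition.
\begin{align*}
  \partial_t \Phi(t, x)
    + \tfrac{1}{2}\operatorname{tr}\!\left( \sigma\sigma^\top \nabla^2 \Phi(t, x) \right)
    + \min_{u}\left\{ L(t, x, u) + f(t, x, u)^\top \nabla \Phi(t, x) \right\} &= 0, \\
  \Phi(T, x) &= G(x).
\end{align*}

% Feedback control implied by the current value function estimate \Phi_\theta.
\begin{equation*}
  u_\theta^*(t, x) \in \operatorname*{arg\,min}_{u}
    \left\{ L(t, x, u) + f(t, x, u)^\top \nabla \Phi_\theta(t, x) \right\}.
\end{equation*}

% One possible loss: control objective plus HJB penalties along sampled
% trajectories z_s driven by u_\theta^*. R_\theta denotes the left-hand side
% of the HJB equation evaluated at \Phi_\theta.
\begin{equation*}
  \mathcal{J}(\theta) =
    \mathbb{E}\!\left[ \int_0^T L(s, z_s, u_\theta^*)\,\mathrm{d}s + G(z_T) \right]
    + \beta_1\, \mathbb{E}\!\left[ \int_0^T \left| R_\theta(s, z_s) \right| \mathrm{d}s \right]
    + \beta_2\, \mathbb{E}\!\left[ \left| \Phi_\theta(T, z_T) - G(z_T) \right| \right].
\end{equation*}
```

Setting $\sigma = 0$ removes the second-order term and leaves a first-order HJB equation, which is the deterministic case that the abstract relates to the method of characteristics.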

Initial Value Problem Enhanced Sampling for Closed-Loop Optimal Control Design with Deep Neural Networks

Exponential Stabilization for Sampled-Data Neural-Network-based Control Systems

Online Reinforcement Learning Neural Network Controller Design for Nanomanipulation

Learning-Based Neural Dynamic Surface Predictive Control for MMC

Neural network optimal feedback control with enhanced closed loop stability

Online Reinforcement Learning-based Neural Network Controller Design for Affine Nonlinear Discrete-time Systems

Neural Network Optimal Feedback Control with Guaranteed Local Stability

A Neural Network Approach for Stochastic Optimal Control

Decentralized Adaptive Neural Inverse Optimal Control of Nonlinear Interconnected Systems

Dynamic Learning from Neural Network‐based Control for Sampled‐data Strict‐feedback Nonlinear Systems

Importance sampling-based approximate optimal planning and control

Neural-network-based Optimal Control for Discrete-Time Nonlinear Systems Using General Value Iteration

Online Optimization of Dynamical Systems with Deep Learning Perception

Adaptive Neural Control for A Class of Pure-Feedback Nonlinear Time-Delay Systems with Asymmetric Saturation Actuators

Sampled-data Stabilization Analysis of Neural-Network-based Control Systems: A Discontinuous Bilateral Looped-Functional Approach

Neural network-based finite-horizon optimal control of uncertain affine nonlinear discrete-time systems

Dynamic Event-Sampled Control of Interconnected Nonlinear Systems Using Reinforcement Learning

Neural network-based finite horizon stochastic optimal control design for nonlinear networked control systems

Stable Neural-Network-based Adaptive Control for Sampled-Data Nonlinear Systems

Intermittent Feedback Optimal Control of Saturated-Input Nonlinear Systems via Adaptive Dynamic Programming

A Deep Learning Approach to Optimal Sampling Problems