Abstract: Training deep reinforcement learning (RL) agents requires overcoming the highly unstable nonconvex stochastic optimization inherent in the trial-and-error mechanism. To tackle this challenge, we propose a physics-inspired optimization algorithm called relativistic adaptive gradient descent (RAD), which enhances long-term training stability. By conceptualizing neural network (NN) training as the evolution of a conformal Hamiltonian system, we present a universal framework for transferring long-term stability from conformal symplectic integrators to iterative NN updating rules, where the choice of kinetic energy governs the dynamical properties of the resulting optimization algorithms. By utilizing relativistic kinetic energy, RAD incorporates principles from special relativity and limits parameter updates below a finite speed, effectively mitigating the influence of abnormal gradients. Additionally, RAD models NN optimization as the evolution of a multi-particle system where each trainable parameter acts as an independent particle with an individual adaptive learning rate. We prove RAD's sublinear convergence under general nonconvex settings, where smaller gradient variance and larger batch sizes contribute to tighter convergence bounds. Notably, RAD reduces to the well-known adaptive moment estimation (ADAM) algorithm when its speed coefficient is set to one and its symplectic factor to a small positive value. Experimental results show RAD outperforming nine baseline optimizers with five RL algorithms across twelve environments, including standard benchmarks and challenging scenarios. In particular, RAD achieves up to a 155.1% performance improvement over ADAM in Atari games, showcasing its efficacy in stabilizing and accelerating RL training.
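
The speed-limiting idea in the abstract can be illustrated with a minimal sketch. This is not the RAD update rule itself, which the abstract does not spell out. It only uses the textbook relativistic relation v = c·p / sqrt(p² + (m·c)²), whose magnitude never exceeds c, inside a simple damped-momentum step. The function names and the parameters `lr`, `friction`, `speed_limit`, and `mass` are hypothetical choices made for illustration.

```python
import math


def relativistic_velocity(momentum, speed_limit=1.0, mass=1.0):
    """Textbook relativistic velocity: v = c*p / sqrt(p^2 + (m*c)^2), so |v| <= c."""
    return speed_limit * momentum / math.sqrt(momentum ** 2 + (mass * speed_limit) ** 2)


def relativistic_step(params, momenta, grads, lr=0.1, friction=0.9,
                      speed_limit=1.0, mass=1.0):
    """One illustrative damped-momentum step (an assumed scheme, not the paper's RAD).

    Each parameter is treated as its own particle: its momentum is damped and
    pushed by the negative gradient, then converted to a bounded velocity.
    """
    # Dissipative momentum update, applied independently per parameter.
    new_momenta = [friction * p - lr * g for p, g in zip(momenta, grads)]
    # Position update: the step size per parameter is at most lr * speed_limit.
    new_params = [
        x + lr * relativistic_velocity(p, speed_limit, mass)
        for x, p in zip(params, new_momenta)
    ]
    return new_params, new_momenta


if __name__ == "__main__":
    # One abnormally large gradient and one ordinary small gradient.
    params, momenta = relativistic_step([0.0, 0.0], [0.0, 0.0], [1e6, 1e-3])
    print(params, momenta)
```

In this sketch an outlier gradient still produces a large momentum, but the resulting parameter step stays bounded by `lr * speed_limit`, while small gradients behave almost like ordinary momentum descent.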

Design of Interacting Particle Systems for Fast Linear Quadratic RL

Controlled interacting particle algorithms for simulation-based reinforcement learning

Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning

Data-Efficient Off-Policy Learning for Distributed Optimal Tracking Control of HMAS with Unidentified Exosystem Dynamics

Data-Driven H-infinity Control with a Real-Time and Efficient Reinforcement Learning Algorithm: An Application to Autonomous Mobility-on-Demand Systems

Reduced-dimensional reinforcement learning control using singular perturbation approximations

Optimal Tracking Control of Nonlinear Multiagent Systems Using Internal Reinforce Q-Learning

Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent

Data-Based Optimal Consensus Control for Multiagent Systems With Policy Gradient Reinforcement Learning

Leveraging Randomized Smoothing for Optimal Control of Nonsmooth Dynamical Systems

Reinforcement Learning for Load-balanced Parallel Particle Tracing

Conformal Symplectic Optimization for Stable Reinforcement Learning

A Q-Learning Algorithm for Discrete-Time Linear-Quadratic Control with Random Parameters of Unknown Distribution: Convergence and Stabilization

Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics

A simple and scalable particle swarm optimization structure based on linear system theory

Model-Free Design of Stochastic LQR Controller from Reinforcement Learning and Primal-Dual Optimization Perspective

Asynchronous Heterogeneous Linear Quadratic Regulator Design

Actively Learning Reinforcement Learning: A Stochastic Optimal Control Approach

Constrained stochastic optimal control with learned importance sampling: A path integral approach

Hierarchical Control of Multi-Agent Systems using Online Reinforcement Learning

Game-Based Backstepping Design for Strict-Feedback Nonlinear Multi-Agent Systems Based on Reinforcement Learning