Abstract: Evolutionary Reinforcement Learning (ERL) has garnered widespread attention in recent years due to its inherent robustness and parallelism. However, the integration of Evolutionary Algorithms (EAs) and Reinforcement Learning (RL) remains relatively rudimentary and static, which can degrade the convergence performance of ERL algorithms. In this study, we introduce a dynamic adaptive module that balances Evolution Strategies (ES) and RL training within ERL. By incorporating elite strategies, this module leverages advantageous individuals to raise the performance of the overall population. In addition, RL policy updates often lack guidance from the population. To address this, we incorporate the policies of the best individuals in the population to provide policy direction. This is achieved through a loss function that uses either L1 or L2 regularization to guide RL training. We refer to the proposed framework as Adaptive Evolutionary Reinforcement Learning (AERL). We evaluate the framework by adopting Soft Actor-Critic (SAC) as the RL algorithm and comparing the resulting Adaptive Evolutionary Soft Actor-Critic (AESAC) algorithm with other algorithms in the MuJoCo environment. The results show that AESAC achieves outstanding convergence performance, and ablation experiments confirm that both improvements are necessary. Notably, the enhancements in AESAC are realized at the population level, which enables broader exploration and reduces the risk of falling into local optima.
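The policy-direction idea can be illustrated with a minimal sketch. The abstract does not give the exact form of the loss, so everything below is an assumption: the function names `policy_direction_penalty` and `guided_actor_loss`, the `weight` coefficient, and the choice to measure the L1 or L2 distance between the RL policy's actions and the best individual's actions on the same batch of states are all hypothetical and are not taken from the paper.

```python
import numpy as np


def policy_direction_penalty(rl_actions, elite_actions, norm="l2"):
    """Mean per-state distance between RL actions and elite actions.

    Hypothetical helper: both inputs have shape (batch, action_dim) and are
    assumed to come from evaluating the two policies on the same states.
    """
    diff = np.asarray(rl_actions, dtype=float) - np.asarray(elite_actions, dtype=float)
    if norm == "l1":
        # L1: sum of absolute differences per state, averaged over the batch.
        return float(np.mean(np.sum(np.abs(diff), axis=-1)))
    if norm == "l2":
        # L2: sum of squared differences per state, averaged over the batch.
        return float(np.mean(np.sum(diff ** 2, axis=-1)))
    raise ValueError("norm must be 'l1' or 'l2'")


def guided_actor_loss(rl_loss, rl_actions, elite_actions, weight=0.1, norm="l2"):
    """RL actor loss plus a weighted pull toward the elite policy (assumed form)."""
    return rl_loss + weight * policy_direction_penalty(rl_actions, elite_actions, norm)


if __name__ == "__main__":
    rl = [[2.0, 0.0], [0.0, 2.0]]
    elite = [[0.0, 0.0], [0.0, 0.0]]
    print(policy_direction_penalty(rl, elite, "l1"))
    print(guided_actor_loss(1.0, rl, elite, weight=0.5, norm="l2"))
```

In a real SAC implementation the penalty would be computed on differentiable tensors so that its gradient flows into the actor. NumPy is used here only to keep the arithmetic easy to check.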

Adaptive Evolutionary Reinforcement Learning with Policy Direction

EAT-C: Environment-Adversarial sub-Task Curriculum for Efficient Reinforcement Learning.

Anytime-Competitive Reinforcement Learning with Policy Prior

Causal Coordinated Concurrent Reinforcement Learning

Efficient Exploration Using Extra Safety Budget in Constrained Policy Optimization

Resilient Constrained Reinforcement Learning

A Contrastive-Enhanced Ensemble Framework for Efficient Multi-Agent Reinforcement Learning

UAC: Offline Reinforcement Learning with Uncertain Action Constraint

Offline Reinforcement Learning with Anderson Acceleration for Robotic Tasks

EAT-C: Environment-Adversarial sub-Task Curriculum for RL

Reachability Constrained Reinforcement Learning.

Off-Policy Risk-Sensitive Reinforcement Learning-Based Constrained Robust Optimal Control

Safe Reinforcement Learning via Hierarchical Adaptive Chance-Constraint Safeguards

Density Constrained Reinforcement Learning

Oracle-Efficient Reinforcement Learning for Max Value Ensembles