Abstract: Evolutionary Reinforcement Learning (ERL) has attracted widespread attention in recent years because of its inherent robustness and parallelism. However, the integration of Evolutionary Algorithms (EAs) and Reinforcement Learning (RL) remains rudimentary and static, which can degrade the convergence performance of ERL algorithms. In this study, we introduce a dynamic adaptive module that balances Evolution Strategies (ES) and RL training within ERL. By incorporating an elite strategy, this module uses advantageous individuals to raise the performance of the overall population. In addition, RL policy updates often lack guidance from the population. To address this, we incorporate the policies of the best individuals in the population to provide policy direction, using a loss function that applies either L1 or L2 regularization during RL training. We refer to the proposed framework as Adaptive Evolutionary Reinforcement Learning (AERL). We evaluate the framework by adopting Soft Actor-Critic (SAC) as the RL algorithm and comparing the resulting Adaptive Evolutionary Soft Actor-Critic (AESAC) algorithm with other algorithms in the MuJoCo environment. The results indicate strong convergence performance for AESAC, and ablation experiments support the necessity of both improvements. Notably, the enhancements in AESAC operate at the population level, which enables broader exploration and reduces the risk of falling into local optima.
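The policy-direction idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the loss, so everything here is an assumption: the penalty is taken over actions produced by the RL actor and by the best individual on the same batch of states, and the names `policy_direction_penalty`, `guided_actor_loss`, and the coefficient `weight` are hypothetical.

```python
import numpy as np


def policy_direction_penalty(rl_actions, elite_actions, norm="l2"):
    """Mean L1 or L2 distance between RL-actor and elite actions.

    Assumption: both action arrays were computed on the same batch of states.
    """
    diff = np.asarray(rl_actions, dtype=float) - np.asarray(elite_actions, dtype=float)
    if norm == "l1":
        return float(np.mean(np.abs(diff)))
    if norm == "l2":
        return float(np.mean(diff ** 2))
    raise ValueError("norm must be 'l1' or 'l2'")


def guided_actor_loss(base_loss, rl_actions, elite_actions, weight=0.1, norm="l2"):
    """Add the penalty to an existing actor loss (for example, the SAC actor loss).

    `weight` is a hypothetical trade-off coefficient, not a value from the paper.
    """
    return base_loss + weight * policy_direction_penalty(rl_actions, elite_actions, norm)


# Illustrative values only: one state, two action dimensions.
rl_batch = [[1.0, 2.0]]
elite_batch = [[0.0, 0.0]]
l1_penalty = policy_direction_penalty(rl_batch, elite_batch, norm="l1")
l2_penalty = policy_direction_penalty(rl_batch, elite_batch, norm="l2")
total_loss = guided_actor_loss(1.0, rl_batch, elite_batch, weight=0.1, norm="l2")
```

In this reading, the L1 variant penalizes every deviation from the elite policy proportionally, while the L2 variant is lenient on small deviations and stricter on large ones; which of the two the paper favors is not stated in the abstract.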

Maximum Entropy Reinforcement Learning with Evolution Strategies

Trust Region Evolution Strategies

Accelerating Reinforcement Learning with a Directional-Gaussian-Smoothing Evolution Strategy

Instance Weighted Incremental Evolution Strategies for Reinforcement Learning in Dynamic Environments

Soft Policy Gradient Method for Maximum Entropy Deep Reinforcement Learning

Challenges in High-Dimensional Reinforcement Learning with Evolution Strategies

MQES: Max-Q Entropy Search for Efficient Exploration in Continuous Reinforcement Learning

Solving Deep Reinforcement Learning Tasks with Evolution Strategies and Linear Policy Networks

Proximal Evolutionary Strategy: Improving Deep Reinforcement Learning through Evolutionary Policy Optimization

Deep Reinforcement Learning Versus Evolution Strategies: A Comparative Survey

Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents

A Max-Min Entropy Framework for Reinforcement Learning

Adaptive Evolutionary Reinforcement Learning with Policy Direction

Maximum Entropy Model-based Reinforcement Learning

Extremum-Seeking Action Selection for Accelerating Policy Optimization

Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow

S$^2$AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic

Evolving Constrained Reinforcement Learning Policy

Hard-Thresholding Meets Evolution Strategies in Reinforcement Learning

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor