Reinforcement learning-based particle swarm optimization with neighborhood differential mutation strategy
Wei Li, Peng Liang, Bo Sun, Yafeng Sun, Ying Huang
DOI: https://doi.org/10.1016/j.swevo.2023.101274
IF: 10.267
2023-02-27
Swarm and Evolutionary Computation
Abstract: The particle swarm optimization (PSO) algorithm is one of the most effective methods for solving engineering optimization problems. Most existing PSO variants use fixed operators. A fixed operator learning mode may restrict the intelligence level of each particle and thus reduce the performance of PSO on optimization problems with complicated fitness landscapes. To address single-objective real-parameter numerical optimization while overcoming this shortcoming, this paper proposes a reinforcement learning-based particle swarm optimization with neighborhood differential mutation strategy (NRLPSO). In NRLPSO, a dynamic oscillation inertial weight (DOW) strategy is designed to give particles the ability to adjust dynamically in different situations. To resolve the operator selection conundrum between exploration and exploitation, a reinforcement learning-based velocity vector generation (VRL) strategy is developed. At each iteration, particles select the velocity update model based on reinforcement learning, and VRL helps to search the problem space thoroughly. A velocity updating mechanism based on cosine similarity (VCS) is applied to control the velocity learning mode and find more promising solutions. Furthermore, to alleviate premature convergence, a local update strategy with neighborhood differential mutation (NDM) is employed to increase the diversity of the algorithm. To verify the efficiency of the proposed algorithm, NRLPSO is tested on the CEC2017 and CEC2022 test suites and compared comprehensively with nine classic or state-of-the-art PSO variants. The experimental results show that NRLPSO outperforms the popular PSO variants in terms of convergence speed and accuracy. Because NRLPSO uses DE mutations, it is also compared with LSHADE_SPACMA, a representative LSHADE variant. LSHADE_SPACMA is better than NRLPSO in terms of algorithm stability and convergence accuracy, so we will refine our work in the future to enhance performance in all aspects.
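The idea of letting each particle choose its velocity update model through reinforcement learning can be illustrated with a minimal sketch. This is not the NRLPSO algorithm: the abstract does not give the state definition, reward, operator set, or parameter values, so everything below is an assumption made for illustration. The two operators in `OPERATORS`, the use of the previous action as the state, the +1/-1 reward, the sphere test function, and all numeric settings are hypothetical. The sketch combines the canonical PSO velocity update with standard tabular Q-learning and epsilon-greedy selection, and omits DOW, VCS, and NDM entirely.

```python
import random

# Hypothetical operator set, not taken from the paper: each operator is a
# (c1, c2) pair of acceleration coefficients for the canonical PSO update.
OPERATORS = [(2.5, 0.5),   # exploration-leaning: favors the personal best
             (0.5, 2.5)]   # exploitation-leaning: favors the global best


def sphere(x):
    """Simple test function (assumed here; the paper uses CEC suites)."""
    return sum(xi * xi for xi in x)


def select_operator(q_row, epsilon, rng):
    """Epsilon-greedy choice of an operator index from one Q-table row."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return q_row.index(max(q_row))


def update_q(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])


def run_sketch(seed=0, n_particles=10, dim=5, iterations=50, w=0.7, epsilon=0.1):
    """Minimize `sphere` with PSO whose particles pick operators via Q-learning."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_f = [sphere(x) for x in xs]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]

    # One Q-table per particle; the state is the previously chosen operator
    # (an assumption made only to keep the sketch small).
    q = [[[0.0] * len(OPERATORS) for _ in OPERATORS] for _ in range(n_particles)]
    state = [0] * n_particles

    for _ in range(iterations):
        for i in range(n_particles):
            a = select_operator(q[i][state[i]], epsilon, rng)
            c1, c2 = OPERATORS[a]
            for d in range(dim):
                # Canonical PSO velocity and position update.
                vs[i][d] = (w * vs[i][d]
                            + c1 * rng.random() * (pbest[i][d] - xs[i][d])
                            + c2 * rng.random() * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            f = sphere(xs[i])
            # Hypothetical reward: +1 if the personal best improved, else -1.
            reward = 1.0 if f < pbest_f[i] else -1.0
            update_q(q[i], state[i], a, reward, a)
            state[i] = a
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = xs[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = xs[i][:], f
    return gbest_f
```

Calling `run_sketch(seed=0)` returns the best sphere value found. Because the global best is only replaced on improvement, the returned value never exceeds the best fitness of the initial swarm.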
computer science, artificial intelligence, theory & methods