Abstract: Finding the optimal game strategy is a difficult problem in unmanned aerial vehicle (UAV) swarm confrontation. As an effective approach to sequential decision-making, multi-agent reinforcement learning (MARL) offers a promising way to realize intelligent countermeasures. However, applying MARL to large-scale UAV swarm confrontation raises two challenges: i) the curse of dimensionality caused by the large number of UAVs in a cluster and ii) the generalization problem caused by the dynamically changing cluster size. To address these problems, we propose a novel MARL paradigm, called Weighted Mean Field Reinforcement Learning, in which the pairwise communication between a UAV and each of its neighbors is modeled as communication between a central UAV and a single virtual UAV, abstracted from the weighted mean effect of the neighboring UAVs. This reduces the multi-agent problem to a two-agent problem, which lowers each agent's input dimension and lets the method adapt to a changing cluster size. The content communicated between UAVs comprises actions and local observations. Sharing actions strengthens cooperation between UAVs and alleviates the non-stationarity of the environment, while sharing local observations expands the perception range of the central UAV so that it obtains more useful information about the environment. An attention mechanism enables UAVs to flexibly select the more valuable information, making our method more scalable than other algorithms. Combining this paradigm with double Q-learning and actor-critic algorithms, we propose the weighted mean field Q-learning (WMFQ) and weighted mean field actor-critic (WMFAC) algorithms. Experiments in a UAV swarm confrontation environment that we constructed verify the effectiveness and scalability of our algorithms.
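
The abstraction described in the abstract, collapsing a variable number of neighbors into one virtual UAV, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `weighted_mean_field`, discrete actions encoded as integer indices, and attention weights taken from a scaled dot product of raw observations. The abstract does not specify how the attention weights are actually computed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_mean_field(central_obs, neighbor_obs, neighbor_actions, n_actions):
    """Collapse any number of neighbors into one 'virtual UAV'.

    Hypothetical helper: returns (mean_action, mean_obs), whose lengths
    depend only on n_actions and len(central_obs), not on neighbor count.
    """
    if not neighbor_obs:
        # No neighbors in range: the virtual UAV carries no information.
        return [0.0] * n_actions, [0.0] * len(central_obs)

    # Assumed attention score: scaled dot product with the central observation.
    scale = math.sqrt(len(central_obs))
    scores = [
        sum(c * o for c, o in zip(central_obs, obs)) / scale
        for obs in neighbor_obs
    ]
    weights = softmax(scores)

    # Weighted mean of one-hot neighbor actions.
    mean_action = [0.0] * n_actions
    for w, a in zip(weights, neighbor_actions):
        mean_action[a] += w

    # Weighted mean of neighbor local observations.
    mean_obs = [
        sum(w * obs[k] for w, obs in zip(weights, neighbor_obs))
        for k in range(len(central_obs))
    ]
    return mean_action, mean_obs


central = [1.0, 0.0]
few = weighted_mean_field(central, [[1.0, 0.0], [0.0, 1.0]], [0, 2], n_actions=3)
many = weighted_mean_field(central, [[1.0, 0.0]] * 5, [0, 1, 1, 2, 2], n_actions=3)
# Both calls return vectors of the same size despite different neighbor counts.
print(len(few[0]), len(few[1]), len(many[0]), len(many[1]))
```

Because the output size is fixed, a Q-network or critic that takes the central UAV's own observation plus this pair as input would not need to change when UAVs join or leave the neighborhood, which is the scalability property the abstract claims.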

Large-scale UAV swarm confrontation based on hierarchical attention actor-critic algorithm

UAV Swarm Confrontation Using Hierarchical Multiagent Reinforcement Learning

Hierarchical Decision and Control for Continuous Multitarget Problem: Policy Evaluation with Action Delay

Weighted Mean Field Reinforcement Learning for Large-Scale UAV Swarm Confrontation

Collaborative Decision-Making Method for Multi-UAV Based on Multiagent Reinforcement Learning

Game of Drones: Intelligent Online Decision Making of Multi-UAV Confrontation

Enhancing multi-UAV air combat decision making via hierarchical reinforcement learning

Autonomous and cooperative control of UAV cluster with multi-agent reinforcement learning

A Bio-Inspired Decision-Making Method of UAV Swarm for Attack-Defense Confrontation via Multi-Agent Reinforcement Learning

UAV Cooperative Air Combat Maneuvering Confrontation Based on Multi-agent Reinforcement Learning

An Effective and Scalable Approach for Swarm-on-Swarm Air Combat Decision

Multi-UAV Cooperative Air Combat Decision-Making Based on Multi-Agent Double-Soft Actor-Critic

An evolutionary multi-agent reinforcement learning algorithm for multi-UAV air combat

Group-Based Deep Reinforcement Learning in Multi-UAV Confrontation

Dense Multi-Agent Reinforcement Learning Aided Multi-UAV Information Coverage for Vehicular Networks

Graph-Based Multi-agent Reinforcement Learning for Large-Scale UAVs Swarm System Control

Multi-UAV Automatic Dynamic Obstacle Avoidance with Experience-Shared A2C

Intelligent Distributed Swarm Control for Large-Scale Multi-UAV Systems: A Hierarchical Learning Approach

A Method of Multi-UAV Cooperative Task Assignment Based on Reinforcement Learning

Oracle-Guided Deep Reinforcement Learning for Large-Scale Multi-UAVs Flocking and Navigation

Hierarchical Reinforcement Learning from Competitive Self-play for Dual-aircraft formation air combat