Abstract: This paper introduces CARSS (Cooperative Attention-guided Reinforcement Subpath Synthesis), a novel approach to address the Traveling Salesman Problem (TSP) by leveraging cooperative Multi-Agent Reinforcement Learning (MARL). CARSS decomposes the TSP solving process into two distinct yet synergistic steps: "subpath generation" and "subpath merging." In the former, a cooperative MARL framework is employed to iteratively generate subpaths using multiple agents. In the latter, these subpaths are progressively merged to form a complete cycle. The algorithm's primary objective is to enhance efficiency in terms of training memory consumption, testing time, and scalability, through the adoption of a multi-agent divide and conquer paradigm. Notably, attention mechanisms play a pivotal role in feature embedding and parameterization strategies within CARSS. The training of the model is facilitated by the independent REINFORCE algorithm. Empirical experiments reveal CARSS's superiority compared to single-agent alternatives: it demonstrates reduced GPU memory utilization, accommodates training graphs nearly 2.5 times larger, and exhibits the potential for scaling to even more extensive problem sizes. Furthermore, CARSS substantially reduces testing time and optimization gaps by approximately 50% for TSP instances of up to 1000 vertices, when compared to standard decoding methods.
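For reference, the "optimization gap" quoted in the abstract is commonly defined in the neural TSP literature as the relative excess of a method's tour length over a reference tour length. The abstract does not state the paper's exact definition, so the symbols below are assumptions: $L_{\text{model}}$ is the length of the tour produced by the method, and $L_{\text{ref}}$ is the length of an optimal or best-known tour for the same instance.

```latex
\mathrm{gap} = \frac{L_{\text{model}} - L_{\text{ref}}}{L_{\text{ref}}} \times 100\%
```

Under this definition, a reduction of approximately 50% would mean, for a hypothetical baseline gap of 10%, a gap of about 5%.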

What problem does this paper attempt to address?

This paper addresses the Traveling Salesman Problem (TSP), a classic combinatorial optimization problem: find the shortest tour that visits every city in a set exactly once and returns to the starting point. TSP is NP-hard, and traditional methods, such as exact algorithms based on cutting planes or dynamic programming and heuristic algorithms, struggle with scalability or optimality on large-scale instances. To address these issues, the paper introduces a new method, **CARSS (Cooperative Attention-guided Reinforcement Subpath Synthesis)**, which applies cooperative multi-agent reinforcement learning (MARL) to TSP. The main contributions of CARSS are as follows:

1. **Two-stage strategy**:
   - **Subpath generation**: A cooperative MARL framework iteratively generates subpaths. Each agent constructs its own portion of the route, and the agents cooperate toward a short overall tour.
   - **Subpath merging**: The subpaths are progressively merged into a complete cycle, which forms the TSP solution.
2. **Application of the attention mechanism**:
   - Attention mechanisms are used for feature embedding and in the parameterization strategies, helping the agents capture relevant information.
3. **Improvement in training and testing efficiency**:
   - The model is trained with the independent REINFORCE algorithm. Compared with single-agent alternatives, CARSS uses less GPU memory and can be trained on graphs nearly 2.5 times larger. Compared with standard decoding methods, it reduces testing time and the optimization gap by approximately 50% on TSP instances of up to 1,000 vertices.
4. **Enhanced scalability**:
   - Through the multi-agent divide-and-conquer paradigm, CARSS shows potential for scaling to even larger problem sizes.

In summary, CARSS combines cooperative multi-agent reinforcement learning with attention mechanisms and decomposes TSP solving into two steps, subpath generation and subpath merging, in order to reduce training memory consumption and testing time and to improve scalability on large-scale TSP instances.
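
The two-step decomposition described above can be illustrated with a minimal Python sketch. This is not the paper's implementation: the learned attention-based policy is replaced by a simple nearest-unvisited-city rule, the merging step is a greedy endpoint-matching heuristic, and the function names (`generate_subpaths`, `merge_subpaths`, `tour_length`) are hypothetical. It only shows the control flow of "several agents grow subpaths in turns, then the subpaths are joined into one cycle."

```python
import math
import random


def tour_length(points, tour):
    """Length of the closed cycle that visits `tour` in order."""
    n = len(tour)
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % n]]) for i in range(n)
    )


def generate_subpaths(points, num_agents, seed=0):
    """Stage 1 (stand-in): agents take turns extending their own subpath.

    Assumption: a nearest-unvisited-city rule replaces the learned policy.
    Requires num_agents <= len(points).
    """
    rng = random.Random(seed)
    starts = rng.sample(range(len(points)), num_agents)
    subpaths = [[s] for s in starts]
    unvisited = set(range(len(points))) - set(starts)
    while unvisited:
        for path in subpaths:
            if not unvisited:
                break
            last = points[path[-1]]
            nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
            path.append(nxt)
            unvisited.remove(nxt)
    return subpaths


def merge_subpaths(points, subpaths):
    """Stage 2 (stand-in): greedily chain subpaths into one visiting order.

    At each step, attach the remaining subpath whose closer endpoint is
    nearest to the current tail, reversing it if its end is the closer one.
    """
    remaining = [list(p) for p in subpaths]
    tour = remaining.pop(0)
    while remaining:
        tail = points[tour[-1]]
        best = None  # (distance, index, reverse_flag)
        for i, p in enumerate(remaining):
            for rev, end in ((False, p[0]), (True, p[-1])):
                d = math.dist(tail, points[end])
                if best is None or d < best[0]:
                    best = (d, i, rev)
        _, idx, rev = best
        nxt = remaining.pop(idx)
        if rev:
            nxt.reverse()
        tour.extend(nxt)
    return tour


if __name__ == "__main__":
    rng = random.Random(42)
    cities = [(rng.random(), rng.random()) for _ in range(50)]
    parts = generate_subpaths(cities, num_agents=5)
    cycle = merge_subpaths(cities, parts)
    print(len(parts), len(cycle), round(tour_length(cities, cycle), 3))
```

In this sketch each agent only ever tracks its own subpath, which loosely mirrors the divide-and-conquer idea; in CARSS the choice of the next vertex would come from a trained attention-based policy rather than a distance rule.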

CARSS: Cooperative Attention-guided Reinforcement Subpath Synthesis for Solving Traveling Salesman Problem

Learning to Cooperate: Application of Deep Reinforcement Learning for Online AGV Path Finding.

TraCo: Learning Virtual Traffic Coordinator for Cooperation with Multi-Agent Reinforcement Learning.

S2rl

AMARL: An Attention-Based Multiagent Reinforcement Learning Approach to the Min-Max Multiple Traveling Salesmen Problem

Routing optimization with Monte Carlo Tree Search-based multi-agent reinforcement learning

A Deep Reinforcement Learning Based Real-Time Solution Policy for the Traveling Salesman Problem

Network Clustering-Based Multi-Agent Reinforcement Learning for Large-Scale Traffic Signal Control

Reinforcement Learning-based Non-Autoregressive Solver for Traveling Salesman Problems

Cooperative Path Planning with Asynchronous Multiagent Reinforcement Learning

A Columnar Competitive Model for Solving Multi-Traveling Salesman Problem

Multi-agent Deep Reinforcement Learning collaborative Traffic Signal Control method considering intersection heterogeneity

Multiagent optimization system for solving the traveling salesman problem (TSP)

Towards Multi-agent Reinforcement Learning based Traffic Signal Control through Spatio-temporal Hypergraphs

A deep reinforcement learning approach for solving the Traveling Salesman Problem with Drone

Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem

Multi-Vehicle Routing Problems with Soft Time Windows: A Multi-Agent Reinforcement Learning Approach

Combining reinforcement learning algorithm and genetic algorithm to solve the traveling salesman problem

Confidence-Based Curriculum Learning for Multi-Agent Path Finding

SCRIMP: Scalable Communication for Reinforcement- and Imitation-Learning-Based Multi-Agent Pathfinding

Mastering Arterial Traffic Signal Control with Multi-Agent Attention-Based Soft Actor-Critic Model