CARSS: Cooperative Attention-guided Reinforcement Subpath Synthesis for Solving Traveling Salesman Problem

Yuchen Shi, Congying Han, Tiande Guo
DOI: https://doi.org/10.48550/arXiv.2312.15412
2023-12-24
Abstract: This paper introduces CARSS (Cooperative Attention-guided Reinforcement Subpath Synthesis), a novel approach to address the Traveling Salesman Problem (TSP) by leveraging cooperative Multi-Agent Reinforcement Learning (MARL). CARSS decomposes the TSP solving process into two distinct yet synergistic steps: "subpath generation" and "subpath merging." In the former, a cooperative MARL framework is employed to iteratively generate subpaths using multiple agents. In the latter, these subpaths are progressively merged to form a complete cycle. The algorithm's primary objective is to enhance efficiency in terms of training memory consumption, testing time, and scalability, through the adoption of a multi-agent divide and conquer paradigm. Notably, attention mechanisms play a pivotal role in feature embedding and parameterization strategies within CARSS. The training of the model is facilitated by the independent REINFORCE algorithm. Empirical experiments reveal CARSS's superiority compared to single-agent alternatives: it demonstrates reduced GPU memory utilization, accommodates training graphs nearly 2.5 times larger, and exhibits the potential for scaling to even more extensive problem sizes. Furthermore, CARSS substantially reduces testing time and optimization gaps by approximately 50% for TSP instances of up to 1000 vertices, when compared to standard decoding methods.
Machine Learning, Multiagent Systems
What problem does this paper attempt to address?
This paper addresses the Traveling Salesman Problem (TSP), a classic combinatorial optimization problem: find the shortest tour that visits every city in a set exactly once and returns to the starting point. TSP is NP-hard, and traditional methods, whether exact algorithms (such as cutting-plane methods and dynamic programming) or heuristics, struggle to balance scalability and optimality on large-scale instances. To address this, the paper introduces **CARSS (Cooperative Attention-guided Reinforcement Subpath Synthesis)**, which applies cooperative multi-agent reinforcement learning (MARL) to TSP. Its main contributions are as follows:

1. **Two-stage strategy**:
   - **Subpath generation**: A cooperative MARL framework iteratively generates subpaths, with multiple agents each constructing part of the solution.
   - **Subpath merging**: The subpaths are progressively merged into a complete cycle, which forms the TSP solution.
2. **Application of the attention mechanism**:
   - Attention mechanisms are used for feature embedding and for parameterizing the agents' policies, which helps the agents capture relevant information.
3. **Improvement in training and testing efficiency**:
   - The model is trained with the independent REINFORCE algorithm. Compared with single-agent alternatives, CARSS uses less GPU memory and accommodates training graphs nearly 2.5 times larger. Compared with standard decoding methods, it reduces testing time and optimization gaps by approximately 50% on TSP instances of up to 1,000 vertices.
4. **Enhanced scalability**:
   - The multi-agent divide-and-conquer paradigm improves scalability and shows potential for scaling to even larger problem sizes.
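The two-stage decomposition can be illustrated with a minimal Python sketch. This is not the CARSS implementation: a greedy nearest-neighbour rule stands in for the learned attention-based policies, the merge rule (attach the subpath whose endpoint lies nearest to the current tail) is an assumption made for illustration, and the names `generate_subpaths`, `merge_subpaths`, and `tour_length` are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def generate_subpaths(cities, num_agents):
    """Stage 1 (stand-in): agents take turns extending their own subpath.

    Assumption: each agent greedily picks the nearest unvisited city;
    in CARSS this choice would come from a learned policy instead.
    """
    if not 1 <= num_agents <= len(cities):
        raise ValueError("need 1 <= num_agents <= number of cities")
    unvisited = set(range(len(cities)))
    subpaths = []
    for start in range(num_agents):  # hypothetical seeding: first cities
        subpaths.append([start])
        unvisited.discard(start)
    while unvisited:
        for path in subpaths:
            if not unvisited:
                break
            tail = cities[path[-1]]
            nxt = min(unvisited, key=lambda j: dist(tail, cities[j]))
            path.append(nxt)
            unvisited.remove(nxt)
    return subpaths


def merge_subpaths(cities, subpaths):
    """Stage 2 (stand-in): progressively join subpaths end to end."""
    remaining = [list(p) for p in subpaths]
    tour = remaining.pop(0)
    while remaining:
        tail = cities[tour[-1]]
        best_i, best_rev, best_d = 0, False, float("inf")
        for i, p in enumerate(remaining):
            # A subpath may be entered from either end (reversed if needed).
            for rev, entry in ((False, p[0]), (True, p[-1])):
                d = dist(tail, cities[entry])
                if d < best_d:
                    best_i, best_rev, best_d = i, rev, d
        nxt = remaining.pop(best_i)
        tour.extend(reversed(nxt) if best_rev else nxt)
    return tour  # the cycle closes implicitly from tour[-1] back to tour[0]


def tour_length(cities, tour):
    """Length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist(cities[tour[i]], cities[tour[(i + 1) % n]]) for i in range(n))


rng = random.Random(0)
demo_cities = [(rng.random(), rng.random()) for _ in range(50)]
demo_tour = merge_subpaths(demo_cities, generate_subpaths(demo_cities, num_agents=4))
print(len(demo_tour), tour_length(demo_cities, demo_tour))
```

The sketch only shows the control flow: several agents each grow a partial path, and a separate step stitches the partial paths into one cycle. The solution quality of this greedy stand-in says nothing about CARSS itself.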
In summary, by combining cooperative multi-agent reinforcement learning with attention mechanisms, CARSS decomposes TSP solving into two steps, subpath generation and subpath merging. Relative to single-agent alternatives, this reduces training memory consumption and testing time and improves scalability on large-scale TSP.
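The "optimization gap" mentioned in the summary is commonly defined as the relative excess of a tour's length over a reference (optimal or best-known) length. The short Python sketch below assumes that definition, since the summary does not spell it out. The function name `optimality_gap` and all numbers are hypothetical and are not results from the paper.

```python
def optimality_gap(tour_len, reference_len):
    """Relative excess over a reference tour length, in percent."""
    if reference_len <= 0:
        raise ValueError("reference length must be positive")
    return 100.0 * (tour_len - reference_len) / reference_len


# Hypothetical lengths, chosen only to show what halving the gap means.
reference = 20.0
baseline_gap = optimality_gap(26.0, reference)  # gap of a baseline decoder
improved_gap = optimality_gap(23.0, reference)  # gap after an improvement
relative_reduction = 1.0 - improved_gap / baseline_gap
print(baseline_gap, improved_gap, relative_reduction)
```

Under this reading, a reduction of approximately 50% refers to the gap itself shrinking by about half, not to the tour length doing so.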