Deep Reinforcement Learning for Large-Scale TSP Graph

Hua Yang
DOI: https://doi.org/10.1109/SMC53992.2023.10394240
2023-10-01
Abstract: In the last few years, the Transformer network architecture has outperformed both Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). The Vision Transformer outperforms CNNs in computer vision, and the original Transformer is far ahead in natural language processing. In combinatorial optimization, however, the Transformer can barely handle some problems, such as the large-scale Traveling Salesman Problem (TSP). We therefore design a more straightforward Transformer-based network structure, termed TSP Transformer, with modifications to the Transformer architecture that suit combinatorial optimization tasks. Given the coordinates of a set of city nodes as input, we train the TSP Transformer to predict a distribution over city permutations, use the negative tour length as the reward, and optimize the network parameters with a policy gradient method. Extensive experiments show that the TSP Transformer improves on previous work by about a factor of five, reducing the optimality gap from 1.22% to 0.24%.
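The training signal described in the abstract (sample a city permutation, score it by negative tour length, update with a policy gradient) can be illustrated with a minimal sketch. This is not the paper's implementation: the tabular `logits` policy is a hypothetical stand-in for the TSP Transformer, the batch-mean baseline is an assumed variance-reduction choice, and the names `tour_length`, `sample_tour`, and `reinforce_step` are introduced here for illustration only.

```python
import numpy as np


def tour_length(coords, perm):
    """Length of the closed tour that visits coords in the order given by perm."""
    ordered = coords[perm]
    # Distance from each city to the next one, wrapping back to the start.
    diffs = ordered - np.roll(ordered, -1, axis=0)
    return float(np.linalg.norm(diffs, axis=1).sum())


def sample_tour(logits, rng):
    """Sample a permutation from a toy tabular policy (assumption, not the paper's model).

    logits[i, j] scores moving from city i to city j. Returns the sampled
    permutation and the gradient of its log-probability w.r.t. logits.
    """
    n = logits.shape[0]
    visited = np.zeros(n, dtype=bool)
    visited[0] = True
    current = 0
    perm = [0]
    grad = np.zeros_like(logits)
    for _ in range(n - 1):
        # Mask visited cities so they receive zero probability.
        masked = np.where(visited, -np.inf, logits[current])
        probs = np.exp(masked - masked[~visited].max())
        probs /= probs.sum()
        nxt = int(rng.choice(n, p=probs))
        # Gradient of log softmax: one_hot(choice) - probs.
        grad[current] -= probs
        grad[current, nxt] += 1.0
        visited[nxt] = True
        perm.append(nxt)
        current = nxt
    return np.array(perm), grad


def reinforce_step(coords, logits, rng, batch=64, lr=0.1):
    """One REINFORCE update with reward = -tour length and a batch-mean baseline."""
    samples = [sample_tour(logits, rng) for _ in range(batch)]
    rewards = np.array([-tour_length(coords, p) for p, _ in samples])
    baseline = rewards.mean()
    # Monte Carlo estimate of the gradient of the expected reward.
    grad = sum((r - baseline) * g for r, (_, g) in zip(rewards, samples)) / batch
    # Gradient ascent on expected reward; also report the mean sampled tour length.
    return logits + lr * grad, float(-rewards.mean())
```

Calling `reinforce_step` repeatedly on a fixed set of coordinates shifts probability toward shorter tours for that instance. A learned model such as the one in the paper would instead condition on the input coordinates, so that one set of parameters generalizes across instances.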
Computer Science