Abstract: Several studies have attempted to solve traveling salesman problems (TSPs) using various deep learning techniques. Among them, Transformer-based models show state-of-the-art performance even for large-scale TSPs. However, they are based on fully-connected attention models and suffer from high computational complexity and GPU memory usage. Our work is the first CNN-Transformer model for TSP based on a CNN embedding layer and partial self-attention. Compared with standard Transformer-based models, our CNN-Transformer model learns spatial features from input data better by using a CNN embedding layer. It also removes considerable redundancy in fully-connected attention models using the proposed partial self-attention. Experimental results show that the proposed CNN embedding layer and partial self-attention are very effective in improving performance and reducing computational complexity. The proposed model exhibits the best performance on real-world datasets and outperforms other existing state-of-the-art (SOTA) Transformer-based models in various aspects. Our code is publicly available at https://github.com/cm8908/CNN_Transformer3.

What problem does this paper attempt to address?

This paper addresses the Traveling Salesman Problem (TSP) with deep learning, combining Convolutional Neural Networks (CNNs) and Transformer models. Specifically, the authors propose a lightweight CNN-Transformer model for TSP.

### Research Background

TSP is a classic NP-hard problem that has been widely studied in computer science and operations research. It asks for the shortest tour in which a "traveling salesman" visits each city exactly once and then returns to the starting point. As the number of cities increases, finding the optimal solution becomes very computationally intensive. Researchers have therefore developed various heuristic and approximation algorithms that find high-quality solutions within a reasonable time frame.

### Solution

The paper proposes a novel CNN-Transformer model that is based on a partial self-attention mechanism and uses a CNN embedding layer to extract spatial features from the input data. This combination lets the model learn the spatial characteristics of the input better, and it reduces computational complexity and GPU memory usage by removing redundant connections from the fully-connected attention model.

### Main Contributions

1. **CNN-Transformer Model**: This is the first CNN-Transformer model used to solve TSP. Experiments show that the CNN embedding layer is very effective in learning local spatial features of various TSP instances.
2. **Partial Self-Attention Mechanism**: The model employs a partial self-attention mechanism that performs attention operations only on the most recently visited nodes, which improves its ability to learn local combinatorial properties.
3. **Efficiency Improvement**: By removing redundant attention connections in the decoder, the model significantly reduces GPU memory usage and lowers inference time.
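The partial self-attention idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes single-head scaled dot-product attention without learned projections, and the function name `partial_attention` and the `window` parameter (how many recently visited nodes are kept) are hypothetical names introduced only for this example.

```python
import math


def partial_attention(query, keys, values, window):
    """Attend only to the `window` most recently visited nodes.

    query:  list of d floats, embedding for the current decoding step
    keys:   list of visited-node key vectors, oldest first (at least one)
    values: list of visited-node value vectors, in the same order as keys
    window: hypothetical hyperparameter, number of recent nodes to keep
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    # Keep only the tail of the visiting history; older nodes are ignored.
    recent_keys = keys[-window:]
    recent_values = values[-window:]

    # Scaled dot-product scores against the retained keys only.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale
              for key in recent_keys]

    # Numerically stable softmax over the retained scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the retained value vectors.
    dim = len(recent_values[0])
    output = [sum(w * v[i] for w, v in zip(weights, recent_values))
              for i in range(dim)]
    return output, weights


# Example: four visited nodes, but only the last two take part in attention.
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
context, attn = partial_attention([1.0, 0.0], history, history, window=2)
```

With full attention the number of scores per decoding step grows with the number of visited nodes, whereas here it is capped at `window`, which is the kind of redundancy removal the summary attributes to the decoder.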
### Experimental Results

The paper validates the effectiveness of the proposed method through multiple experiments. The results show that the model not only performs best on real-world datasets but also surpasses existing state-of-the-art Transformer-based models on various metrics, including optimality gap and average predicted tour length. The model also shows good training and inference times, as well as lower GPU memory consumption.

In summary, this paper provides a novel and efficient deep learning framework that can effectively solve large-scale TSP instances and has strong practical value.
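The tour-length and gap metrics mentioned above can be computed as in the following sketch. The summary does not define the gap, so this assumes the common definition: the relative excess of the predicted tour length over a reference optimal length, in percent. Euclidean coordinates are also an assumption, and the function names are illustrative.

```python
import math


def tour_length(coords, tour):
    """Length of the closed tour that visits `coords` in the order `tour`.

    coords: list of (x, y) city coordinates (Euclidean distances assumed)
    tour:   list of city indices, each appearing exactly once
    """
    total = 0.0
    for i in range(len(tour)):
        here = coords[tour[i]]
        # Wrap around so the last city connects back to the first.
        nxt = coords[tour[(i + 1) % len(tour)]]
        total += math.dist(here, nxt)
    return total


def optimality_gap(predicted_length, optimal_length):
    """Relative excess over the optimal length, in percent (assumed definition)."""
    return 100.0 * (predicted_length - optimal_length) / optimal_length


# Example: the four corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
best = tour_length(square, [0, 1, 2, 3])      # perimeter of the square: 4.0
crossed = tour_length(square, [0, 2, 1, 3])   # uses both diagonals, so longer
gap_percent = optimality_gap(crossed, best)
```

Averaging `tour_length` over a set of test instances gives an average predicted tour length, and averaging `optimality_gap` against solver-computed optima gives a mean gap.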

A lightweight CNN-transformer model for learning traveling salesman problems

Solving Optimization Problems Through Fully Convolutional Networks: an Application to the Traveling Salesman Problem

Memory-efficient Transformer-based network model for Traveling Salesman Problem

The Transformer Network for the Traveling Salesman Problem

Deep Reinforcement Learning for Large-Scale TSP Graph

A Deep Reinforcement Learning Based Real-Time Solution Policy for the Traveling Salesman Problem

Towards Feature-free TSP Solver Selection: A Deep Learning Approach

Less Is More -- On the Importance of Sparsification for Transformers and Graph Neural Networks for TSP

A deep reinforcement learning approach for solving the Traveling Salesman Problem with Drone

Unsupervised Learning for Solving the Travelling Salesman Problem

CycleFormer : TSP Solver Based on Language Modeling

HiTSP: Towards a Hierarchical Neural Framework for Large-scale Traveling Salesman Problems

Hierarchical Neural Constructive Solver for Real-world TSP Scenarios

MCT-TTE: Travel Time Estimation Based on Transformer and Convolution Neural Networks

Graph Neural Network Guided Local Search for the Traveling Salesperson Problem

Sparse Mobile Crowdsensing for Cost-Effective Traffic State Estimation with Spatio-Temporal Transformer Graph Neural Network

Spatially Constrained Transformer with Efficient Global Relation Modelling for Spatio-Temporal Prediction

Neural TSP Solver with Progressive Distillation

Learning dynamic and hierarchical traffic spatiotemporal features with Transformer

An Efficient Hybrid Graph Network Model for Traveling Salesman Problem with Drone