A deep learning Attention model to solve the Vehicle Routing Problem and the Pick-up and Delivery Problem with Time Windows

Baptiste Rabecq, Rémy Chevrier
DOI: https://doi.org/10.48550/arXiv.2212.10399
2023-01-10
Abstract: SNCF, the French public train company, is experimenting with new types of transportation services by tackling vehicle routing problems. While many deep learning models have been used to tackle vehicle routing problems efficiently, it is difficult to take time-related constraints into account. In this paper, we solve the Capacitated Vehicle Routing Problem with Time Windows (CVRPTW) and the Capacitated Pickup and Delivery Problem with Time Windows (CPDPTW) with a constructive iterative Deep Learning algorithm. We use an Attention Encoder-Decoder structure and design a novel insertion heuristic for the feasibility check of the CPDPTW. Our model yields results that are better than the best known learning-based solutions on the CVRPTW. We show the feasibility of deep learning techniques for solving the CPDPTW but witness the limitations of our iterative approach in terms of computational complexity.
Artificial Intelligence, Optimization and Control
What problem does this paper attempt to address?
This paper uses an attention-based deep learning model to solve the Capacitated Vehicle Routing Problem with Time Windows (CVRPTW) and the Capacitated Pickup and Delivery Problem with Time Windows (CPDPTW). Specifically:

1. **Capacitated Vehicle Routing Problem with Time Windows (CVRPTW)**:
   - **Problem description**: The input is a graph with one depot node and multiple customer nodes, together with a set of vehicles. Each customer node has a demand and a time window, and a vehicle must arrive and complete the service within this window. The goal is to minimize the total travel distance of all vehicles while satisfying the vehicle capacity limits and the time window constraints.
   - **Challenge**: Time windows increase the complexity of the problem, making it difficult for traditional heuristic methods to solve large-scale instances efficiently.
2. **Capacitated Pickup and Delivery Problem with Time Windows (CPDPTW)**:
   - **Problem description**: This problem extends the CVRPTW with pickup and delivery requirements. Each customer request consists of a pickup node and a delivery node, and the pickup node must be visited before the delivery node. The goal is again to minimize the total travel distance of all vehicles while satisfying all constraints.
   - **Challenge**: In addition to the time windows, the pickup-before-delivery order must be respected, which further increases the difficulty of the problem.

### Main contributions of the paper

1. **Environment design**: The authors designed a new computing environment that handles time window and precedence constraints. In particular, they introduced an insertion heuristic that quickly checks the feasibility of solutions, reducing the complexity of this check from exponential to polynomial.
2. **Deep learning model**: The model uses an attention encoder-decoder structure with multi-head attention to improve its expressive power.
3. **Reinforcement learning framework**: The model is trained with the REINFORCE algorithm combined with the POMO baseline, which reduces the variance of the gradient estimate and improves training efficiency.

### Experimental results

- **CVRPTW**: On the Solomon data set, the model outperforms other learning-based methods in total travel distance but is slightly worse in the number of vehicles used.
- **CPDPTW**: On the Li and Lim data sets, the model cannot compete with the best existing solutions in total travel distance or number of vehicles used, but it produces multiple feasible solutions in a relatively short time, which makes it practically useful.

### Conclusion

This paper shows how an end-to-end deep learning model can solve complex planning problems, especially those with time window and precedence constraints. Although there is still room for improvement in some respects, the work offers a new approach to this class of problems.
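To make the constraints of the CPDPTW concrete, the sketch below checks a single route for capacity, time window, and pickup-before-delivery feasibility in one pass, then naively enumerates every feasible way to insert one pickup/delivery pair. It is an illustration only, not the authors' insertion heuristic. The `Node` fields, the function names, the Euclidean travel time, and the convention that service must start no later than the end of the window are all assumptions made for this example.

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical node record; field names are assumptions for this sketch."""
    x: float
    y: float
    demand: float                    # positive at a pickup, negative at a delivery
    ready: float                     # earliest time service may start
    due: float                       # latest time service may start (assumed convention)
    service: float = 0.0             # service duration
    pickup_of: Optional[int] = None  # for a delivery node: index of its pickup


def travel_time(a: Node, b: Node) -> float:
    # Assumption: travel time equals Euclidean distance.
    return hypot(a.x - b.x, a.y - b.y)


def route_is_feasible(route: List[int], nodes: Dict[int, Node], capacity: float) -> bool:
    """Single pass over a route that starts (and usually ends) at the depot."""
    time, load = 0.0, 0.0
    visited = set()
    prev = nodes[route[0]]
    for idx in route[1:]:
        node = nodes[idx]
        arrival = time + travel_time(prev, node)
        start = max(arrival, node.ready)  # wait if the vehicle arrives early
        if start > node.due:
            return False                  # time window violated
        if node.pickup_of is not None and node.pickup_of not in visited:
            return False                  # delivery visited before its pickup
        load += node.demand
        if load > capacity:
            return False                  # vehicle overloaded
        time = start + node.service
        visited.add(idx)
        prev = node
    return True


def feasible_insertions(route: List[int], pickup: int, delivery: int,
                        nodes: Dict[int, Node], capacity: float) -> List[List[int]]:
    """Try every position pair (pickup first, delivery later) and keep feasible routes."""
    results = []
    for i in range(1, len(route)):                 # pickup goes after the start depot
        with_pickup = route[:i] + [pickup] + route[i:]
        for j in range(i + 1, len(with_pickup)):   # delivery strictly after the pickup
            candidate = with_pickup[:j] + [delivery] + with_pickup[j:]
            if route_is_feasible(candidate, nodes, capacity):
                results.append(candidate)
    return results
```

For a route of length n, this enumeration tries O(n²) position pairs and each check costs O(n), so the total work stays polynomial in the route length.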
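The variance-reduction role of the POMO baseline in REINFORCE can be illustrated with a few lines of plain Python. In POMO, several rollouts are sampled for the same instance and their mean cost serves as a shared baseline. This is a minimal sketch of that idea, not the paper's training code: the function names are hypothetical, and a real implementation would compute the log-probabilities with an autodiff framework so that the surrogate loss can be differentiated.

```python
from statistics import fmean
from typing import List


def shared_baseline_advantages(costs: List[float]) -> List[float]:
    """Advantage of each rollout relative to the mean cost of all rollouts
    sampled for the same instance (the shared baseline)."""
    baseline = fmean(costs)
    return [c - baseline for c in costs]


def reinforce_surrogate_loss(costs: List[float], log_probs: List[float]) -> float:
    """Surrogate loss whose gradient with respect to the policy parameters is
    the REINFORCE estimate: mean over rollouts of advantage * log-probability.
    Here log_probs are plain floats, so this only shows the arithmetic."""
    advantages = shared_baseline_advantages(costs)
    return fmean([a * lp for a, lp in zip(advantages, log_probs)])
```

Because the advantages of one instance sum to zero, rollouts that are cheaper than average have their probability pushed up and more expensive ones pushed down, without needing a separately learned critic.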