SmartPathfinder: Pushing the Limits of Heuristic Solutions for Vehicle Routing Problem with Drones Using Reinforcement Learning

Navid Mohammad Imran,Myounggyu Won
2024-04-14
Abstract:The Vehicle Routing Problem with Drones (VRPD) seeks to optimize the routing paths for both trucks and drones, where the trucks are responsible for delivering parcels to customer locations, and the drones are dispatched from these trucks for parcel delivery, subsequently being retrieved by the trucks. Given the NP-Hard complexity of VRPD, numerous heuristic approaches have been introduced. However, improving solution quality and reducing computation time remain significant challenges. In this paper, we conduct a comprehensive examination of heuristic methods designed for solving VRPD, distilling and standardizing them into core elements. We then develop a novel reinforcement learning (RL) framework that is seamlessly integrated with the heuristic solution components, establishing a set of universal principles for incorporating the RL framework with heuristic strategies in an aim to improve both the solution quality and computation speed. This integration has been applied to a state-of-the-art heuristic solution for VRPD, showcasing the substantial benefits of incorporating the RL framework. Our evaluation results demonstrated that the heuristic solution incorporated with our RL framework not only elevated the quality of solutions but also achieved rapid computation speeds, especially when dealing with extensive customer locations.
Computers and Society,Machine Learning
What problem does this paper attempt to address?
### What problem does this paper attempt to solve? This paper primarily proposes a new solution for the Vehicle Routing Problem with Drones (VRPD). Specifically: 1. **Problem Background**: - VRPD is an extension of the traditional Vehicle Routing Problem (VRP), combining both truck and drone delivery methods to optimize delivery routes. - Current methods for solving VRPD mostly rely on heuristic or metaheuristic algorithms, but due to the NP-hard nature of the problem, these methods have limitations when dealing with large-scale instances. 2. **Objectives**: - Improve the quality of the solution. - Reduce computation time. 3. **Research Methods**: - The paper first conducts a comprehensive analysis of existing VRPD heuristic algorithms and decomposes them into general components. - It proposes a new framework called SmartPathfinder, which combines Reinforcement Learning (RL) with existing heuristic algorithms to enhance solution quality and computational efficiency. - By integrating the RL framework with state-of-the-art heuristic algorithms (such as those based on genetic algorithms), the effectiveness of this approach is demonstrated. 4. **Experimental Results**: - Experiments show that by integrating the RL framework, not only is the solution quality improved, but computation time is also significantly reduced, especially when dealing with large-scale customer nodes. - Specifically, compared to heuristic algorithms without RL integration, the RL+MA method reduces total operation time by up to 23.7% when handling 100 customer nodes, and computation time is also shortened. Through the above research, this paper aims to overcome the limitations of existing heuristic algorithms in handling large-scale VRPD problems, providing a more efficient and high-quality solution.