Token-based Deep Reinforcement Learning for Heterogeneous VRP with Service Time Constraints

Yujun Wang, Xiaopeng Hong, Yabin Wang, Junzhou Zhao, Guanghui Sun, Baoxing Qin
DOI: https://doi.org/10.1016/j.knosys.2024.112173
IF: 8.139
2024-01-01
Knowledge-Based Systems
Abstract: Heterogeneous Vehicle Routing aims to construct routes for various vehicles while optimizing an objective subject to a series of constraints. However, existing deep reinforcement learning-based methods often ignore service time constraints, which prohibit vehicles from leaving their current nodes until the service time is met. This limitation restricts their practical application. To address these concerns, we introduce the Heterogeneous Vehicle Routing Problem with Service Time Constraints (HVRP-STC) and formulate it as a Markov Decision Process with Service Time Constraints. We propose a novel deep reinforcement learning-based model, Token-based Deep Reinforcement Learning (TDRL), to solve this problem. To provide sufficient and timely information for decision making, we design a State Token Coding (STC) mechanism that encodes and updates individual and overall vehicle and node states as tokens of different types. To determine the pairs of vehicles and nodes and generate actions, we propose a Heterogeneous Decoder (HD) with a vehicle-selector and multiple vehicle-specific node-selectors. This decouples the vehicle-node selection tasks and customizes the task of choosing nodes to visit for individual vehicles, better catering to the heterogeneous nature of HVRP-STC. We evaluate the proposed method on four types of datasets with instances of different sizes, large spatial coverage, and varied mathematical models. Our results show that TDRL consistently outperforms state-of-the-art DRL methods. We will release the datasets and the source code of this benchmark with the paper via https://github.com/Vision-Intelligence-and-Robots-Group/ToDRL.
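The two ideas in the abstract that are easiest to picture in code are the service time constraint (a vehicle may not leave a node until service there is finished) and the decoupled decision (first pick a vehicle, then pick a node for that vehicle). The sketch below is only a toy illustration of those two ideas and is not the TDRL model: the learned vehicle-selector and node-selectors are replaced by simple greedy rules, and every name in it (`Vehicle`, `free_at`, `step`, the per-vehicle `speed`, Euclidean travel times) is an assumption made for this example rather than something taken from the paper or its repository.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Set, Tuple


@dataclass
class Vehicle:
    # All fields are hypothetical; the paper's state representation is token-based.
    pos: Tuple[float, float]
    speed: float           # heterogeneity modelled here as a per-vehicle speed
    free_at: float = 0.0   # earliest time the vehicle may leave its current node


def step(vehicles: List[Vehicle],
         nodes: List[Tuple[float, float]],
         service_time: List[float],
         visited: Set[int]) -> Tuple[int, int, float]:
    """One decision step: choose a vehicle, then a node for that vehicle."""
    # Stage 1 (greedy stand-in for a vehicle-selector):
    # take the vehicle that becomes free earliest.
    v_idx = min(range(len(vehicles)), key=lambda i: vehicles[i].free_at)
    v = vehicles[v_idx]

    def dist(j: int) -> float:
        return hypot(nodes[j][0] - v.pos[0], nodes[j][1] - v.pos[1])

    # Stage 2 (greedy stand-in for that vehicle's node-selector):
    # take the nearest node that has not been visited yet.
    candidates = [j for j in range(len(nodes)) if j not in visited]
    n_idx = min(candidates, key=dist)

    # The vehicle departs no earlier than free_at, then travels to the node.
    arrival = v.free_at + dist(n_idx) / v.speed

    # Service time constraint: the vehicle stays until service is complete.
    v.pos = nodes[n_idx]
    v.free_at = arrival + service_time[n_idx]
    visited.add(n_idx)
    return v_idx, n_idx, arrival
```

Calling `step` repeatedly until `visited` covers all nodes yields one greedy schedule; in a learned approach, the two `min` rules would presumably be replaced by policy networks, while the `free_at` bookkeeping is what enforces the constraint.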