Abstract: The heterogeneous fleet and demand vehicle routing problem with time-window constraints (HFDVRPTW) is an optimization problem of significant importance in real-world logistics operations. In this paper, we propose a deep reinforcement learning (DRL)-based method, termed spatial Edge-Feature EnhanCed mulTIgraph fusion encoder With spectral-based embedding and hieRarchical decOder with learnable TEmpoRal positional embedding (EFECTIW-ROTER, pronounced "Effective Router"), to tackle this complex and practical problem. EFECTIW-ROTER represents node connectivity with two sparse graphs, where nodes correspond to customers and the depot. The sparsity arises from node reachability, which is determined by two factors: the time-window constraints, and each customer's demand relative to the list of acceptable vehicle attributes specified for service within the heterogeneous fleet. EFECTIW-ROTER's encoding module uses two graph Transformer models to capture the interactions between nodes under these factors. One model encodes customers' heterogeneous demand with spatial edge features based on travel time between nodes, while the other employs temporal positional embeddings to capture temporal relationships based on time-window ordering. A fusion model integrates the node interactions from the two graphs, and spectral-attention-based pooling provides an effective state representation for the DRL-based method. The hierarchical attention decoder operates in two stages: heterogeneous vehicle selection and node selection. Enhanced with positional embeddings, the decoder makes routing decisions that account for the ordering of the time windows. Experiments using real-world traffic data from two major Canadian cities show that EFECTIW-ROTER outperforms current state-of-the-art DRL-based and heuristic methods: it reduces travel times and also runs faster than conventional heuristics. Additional experiments demonstrate that it generalizes to larger instances.
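
To make the two sparse graphs concrete, the following is a minimal sketch of how reachability-based edge masks could be built. The abstract does not state the exact rules, so everything here is an assumption for illustration: the `Node` fields, the time-window test (leave node i at its earliest time, serve it, travel, and arrive before node j closes), and the compatibility test (two nodes share at least one acceptable vehicle type) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical node record (depot or customer); field names are assumptions."""
    earliest: float       # time window opens
    latest: float         # time window closes
    service: float        # service duration at the node
    vehicles: frozenset   # vehicle types allowed to serve this node


def time_window_edges(nodes, travel):
    """Assumed rule: edge i -> j exists if a vehicle that starts service at i
    at i's earliest time can still arrive at j before j's window closes."""
    n = len(nodes)
    return [
        [
            i != j
            and nodes[i].earliest + nodes[i].service + travel[i][j] <= nodes[j].latest
            for j in range(n)
        ]
        for i in range(n)
    ]


def compatibility_edges(nodes):
    """Assumed rule: edge i -> j exists if at least one vehicle type
    is acceptable for both nodes."""
    n = len(nodes)
    return [
        [i != j and bool(nodes[i].vehicles & nodes[j].vehicles) for j in range(n)]
        for i in range(n)
    ]


# Toy instance: node 0 is the depot, nodes 1-3 are customers.
nodes = [
    Node(0, 100, 0, frozenset({"van", "truck"})),
    Node(10, 20, 5, frozenset({"van"})),
    Node(30, 40, 5, frozenset({"truck"})),
    Node(12, 18, 5, frozenset({"van", "truck"})),
]
# Symmetric travel times of 5 between every pair of distinct nodes.
travel = [[0 if i == j else 5 for j in range(4)] for i in range(4)]

tw = time_window_edges(nodes, travel)
comp = compatibility_edges(nodes)
```

In this toy instance, customer 2 cannot precede customer 1 in the time-window graph because its window opens after customer 1's closes, and customers 1 and 2 are disconnected in the compatibility graph because no vehicle type serves both. Masks of this kind could restrict attention in each graph Transformer to feasible neighbours, though how the paper applies them is not specified in the abstract.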

EFECTIW-ROTER: Deep Reinforcement Learning Approach for Solving Heterogeneous Fleet and Demand Vehicle Routing Problem with Time-Window Constraints

A deep reinforcement learning approach for solving the Traveling Salesman Problem with Drone

Combining decomposition and graph capsule network for multi-objective vehicle routing optimization

Multiobjective Vehicle Routing Optimization with Time Windows: A Hybrid Approach Using Deep Reinforcement Learning and NSGA-II

Multi-Decoder Attention Model with Embedding Glimpse for Solving Vehicle Routing Problems

Routing optimization with Monte Carlo Tree Search-based multi-agent reinforcement learning

Multi-Agent Spatial-Temporal Transformer for Traffic Signal Control

Solving the VRP Using Transformer-Based Deep Reinforcement Learning

Multi-Objective Combinatorial Optimization Algorithm Based on Asynchronous Advantage Actor–Critic and Graph Transformer Networks

Improved ant colony optimization for the vehicle routing problem with split pickup and split delivery