Abstract: Finding a feasible and prompt solution to the Vehicle Routing Problem (VRP) is a prerequisite for efficient freight transportation, seamless logistics, and sustainable mobility. Traditional optimization methods reach their limits when confronted with the real-world complexity of VRPs, which involve numerous constraints and objectives. Recently, the ability of generative Artificial Intelligence (AI) to solve combinatorial tasks, known as Neural Combinatorial Optimization (NCO), demonstrated promising results, offering new perspectives. In this study, we propose an NCO approach to solve a time-constrained capacitated VRP with a finite vehicle fleet size. The approach is based on an encoder-decoder architecture, formulated in line with the Policy Optimization with Multiple Optima (POMO) protocol and trained via a Proximal Policy Optimization (PPO) algorithm. We successfully trained the policy with multiple objectives (minimizing the total distance while maximizing vehicle utilization) and evaluated it on medium and large instances, benchmarking it against state-of-the-art heuristics. The method is able to find adequate and cost-efficient solutions, showing both flexibility and robust generalization. Finally, we provide a critical analysis of the solution generated by NCO and discuss the challenges and opportunities of this new branch of intelligent learning algorithms emerging in optimization science, focusing on freight transportation.
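The two training objectives named in the abstract (total distance and vehicle utilization) can be combined into a single scalar reward. The sketch below is only an illustration of that idea: the function names `route_length` and `route_reward`, the weights `w_dist` and `w_util`, and the definition of utilization as the mean load-to-capacity ratio over used vehicles are all assumptions, since the abstract does not state how the paper's reward is defined or calibrated.

```python
import math


def route_length(route, coords):
    """Euclidean length of one closed route: depot (node 0) -> customers -> depot."""
    path = [0] + list(route) + [0]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))


def route_reward(routes, coords, demands, capacity, w_dist=1.0, w_util=1.0):
    """Hypothetical scalarized reward (higher is better).

    Penalizes total travelled distance and rewards the mean fill rate of
    the vehicles that are actually used. Weights are illustrative only.
    """
    used = [r for r in routes if r]  # ignore vehicles that never leave the depot
    if not used:
        return 0.0
    total_dist = sum(route_length(r, coords) for r in used)
    fill_rates = [sum(demands[c] for c in r) / capacity for r in used]
    utilization = sum(fill_rates) / len(used)
    return -w_dist * total_dist + w_util * utilization


# Toy instance: depot at the origin, two customers, vehicle capacity 10.
coords = {0: (0.0, 0.0), 1: (3.0, 0.0), 2: (3.0, 4.0)}
demands = {1: 4, 2: 4}

one_vehicle = route_reward([[1, 2]], coords, demands, capacity=10)
two_vehicles = route_reward([[1], [2]], coords, demands, capacity=10)
print(one_vehicle, two_vehicles)
```

On this toy instance, serving both customers with one vehicle scores higher than splitting them across two, because the single route is both shorter and better filled. In practice the relative weights would need tuning, since distance and utilization are on different scales.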

### What problem does this paper attempt to solve?

This paper aims to solve the Vehicle Routing Problem (VRP) with time-window constraints and a limited number of vehicles. Specifically, the authors propose a method based on Neural Combinatorial Optimization (NCO) to deal with this complex combinatorial optimization problem. The main objectives of the research are:

1. **Improve solution efficiency**: Traditional optimization methods often struggle to find feasible solutions quickly when facing complex real-world VRPs. The study therefore applies generative AI and Reinforcement Learning (RL), in particular the Pointer Network and the Proximal Policy Optimization (PPO) algorithm, to improve solution efficiency.
2. **Handle multi-objective optimization**: In actual logistics scenarios, multiple objectives usually need to be considered simultaneously, such as minimizing the total travel distance and maximizing vehicle utilization. To this end, the study designs and calibrates the reward function to balance these conflicting objectives.
3. **Integrate time-window constraints**: To make the model more flexible and adaptable, the study introduces a method named "Attention Score Amplifier for Pointer Network (ASAP)". This method integrates customer-related time-window constraints by dynamically adjusting attention scores and giving priority to urgent service requests during route generation.
4. **Expand the existing framework**: The study extends the existing NCO framework, especially the Reinforcement Learning for Operation Research (RLOR) framework, and applies the Policy Optimization with Multiple Optima (POMO) exploration scheme to better handle a limited number of vehicles and hard time-window constraints.
5. **Verify and evaluate**: The study evaluates the proposed model experimentally, including benchmark tests on medium-scale and large-scale instances, and compares it with existing state-of-the-art heuristic algorithms, reporting that it finds efficient and low-cost solutions.

In summary, this paper is mainly dedicated to developing a new method that can quickly and flexibly solve the VRP with time-window constraints and a limited number of vehicles, providing new perspectives and technical support for intelligent logistics systems.
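The idea of adjusting attention scores so that urgent customers are favored during route generation can be illustrated with a small sketch. This is not the paper's ASAP formula, which the summary above does not spell out. The function name `amplified_probs`, the additive bonus `alpha / (1 + slack)`, and the parameter `alpha` are all assumptions chosen for illustration: customers whose time window closes soon receive a larger bonus, and customers that are infeasible or already past their due time are masked out before the softmax.

```python
import math


def amplified_probs(scores, due_times, now, feasible, alpha=1.0):
    """Selection probabilities over candidate customers.

    scores    : raw attention scores (logits) from the decoder
    due_times : closing time of each customer's time window
    now       : current time of the vehicle
    feasible  : False for customers that must be masked (e.g. capacity exceeded)
    alpha     : strength of the hypothetical urgency bonus (assumption)
    """
    logits = []
    for score, due, ok in zip(scores, due_times, feasible):
        if not ok or due < now:
            logits.append(-math.inf)  # masked: can never be selected
        else:
            slack = due - now  # time left before the window closes
            logits.append(score + alpha / (1.0 + slack))

    top = max(logits)
    if top == -math.inf:
        raise ValueError("no feasible customer left")

    # Numerically stable softmax; masked entries become exactly 0.
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Three candidates with identical raw scores; the third is masked.
probs = amplified_probs(
    scores=[0.0, 0.0, 0.0],
    due_times=[10.0, 2.0, 5.0],
    now=1.0,
    feasible=[True, True, False],
)
print(probs)
```

With equal raw scores, the customer whose window closes at time 2 ends up with a higher selection probability than the one due at time 10, and the masked customer gets probability zero. An additive bonus was used here because raw attention scores can be negative, which makes a multiplicative factor harder to interpret.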

Learn to Solve Vehicle Routing Problems ASAP: A Neural Optimization Approach for Time-Constrained Vehicle Routing Problems with Finite Vehicle Fleet

Neural Combinatorial Optimization Algorithms for Solving Vehicle Routing Problems: A Comprehensive Survey with Perspectives

A Neural Multi-Objective Capacitated Vehicle Routing Optimization Algorithm Based on Preference Adjustment

A Case Study of Vehicle Route Optimization

Online Vehicle Routing With Neural Combinatorial Optimization and Deep Reinforcement Learning

Deep Reinforcement Learning Algorithm for Fast Solutions to Vehicle Routing Problem with Time-Windows

Neural Large Neighborhood Search for the Capacitated Vehicle Routing Problem

Asynchronous Optimization of Part Logistics Routing Problem

Multi-objective vehicle routing and loading with time window constraints: a real-life application

Data Driven VRP: A Neural Network Model to Learn Hidden Preferences for VRP

Genetic Algorithms with Neural Cost Predictor for Solving Hierarchical Vehicle Routing Problems

A deeper look back at Y

A Deep Reinforcement Learning-Based Adaptive Search for Solving Time-Dependent Green Vehicle Routing Problem

Learning to Handle Complex Constraints for Vehicle Routing Problems

Neural Networks for Vehicle Routing Problem

Solving the capacitated vehicle routing problem with time windows via graph convolutional network assisted tree search and quantum-inspired computing

Multi-Vehicle Routing Problems with Soft Time Windows: A Multi-Agent Reinforcement Learning Approach

A multi-agent deep reinforcement learning approach for solving the multi-depot vehicle routing problem

Combinatorial Optimization enriched Machine Learning to solve the Dynamic Vehicle Routing Problem with Time Windows

Dynamic Vehicle Routing Solution in the Framework of Nature-Inspired Algorithms

An Adaptive Spiking Neural P System for Solving Vehicle Routing Problems