Reinforcement Learning for Solving Stochastic Vehicle Routing Problem with Time Windows

Zangir Iklassov, Ikboljon Sobirov, Ruben Solozabal, Martin Takac
2024-02-15
Abstract: This paper introduces a reinforcement learning approach to optimize the Stochastic Vehicle Routing Problem with Time Windows (SVRP), focusing on reducing travel costs in goods delivery. We develop a novel SVRP formulation that accounts for uncertain travel costs and demands, alongside specific customer time windows. An attention-based neural network trained through reinforcement learning is employed to minimize routing costs. Our approach addresses a gap in SVRP research, which traditionally relies on heuristic methods, by leveraging machine learning. The model outperforms the Ant-Colony Optimization algorithm, achieving a 1.73% reduction in travel costs. It uniquely integrates external information, demonstrating robustness in diverse environments, making it a valuable benchmark for future SVRP studies and industry application.
Artificial Intelligence
What problem does this paper attempt to address?
This paper applies Reinforcement Learning (RL) to the Stochastic Vehicle Routing Problem with Time Windows (SVRP), a problem that prior research has mostly tackled with heuristic methods. The authors propose an SVRP formulation that accounts for uncertain travel costs, uncertain demands, and customer-specific time windows, and they train an attention-based neural network with RL to minimize routing costs. Compared with the Ant-Colony Optimization algorithm, the model reduces travel costs by 1.73%. It can also integrate external information and remains robust across diverse environments, which suits it to real logistics scenarios and makes it a useful benchmark for future SVRP research and industrial applications.
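To make the problem setting concrete, here is a minimal sketch of how the cost of one route might be simulated when travel costs are uncertain and customers have time windows. It is not the paper's formulation: the function name `simulate_route_cost`, the uniform multiplicative noise, the linear lateness penalty, and the use of travel cost as travel time are all assumptions made for illustration, and demand uncertainty is left out for brevity.

```python
import math
import random


def simulate_route_cost(route, coords, windows, noise=0.1, late_penalty=1.0, rng=None):
    """Simulate one noisy traversal of a single-vehicle route (hypothetical model).

    route   : customer ids in visiting order (the depot, id 0, is implicit)
    coords  : dict mapping node id -> (x, y)
    windows : dict mapping customer id -> (opens, closes)
    noise   : travel cost is scaled by a factor drawn from [1 - noise, 1 + noise]
    """
    rng = rng or random.Random(0)
    position, clock, cost = 0, 0.0, 0.0
    for node in list(route) + [0]:  # return to the depot at the end
        base = math.dist(coords[position], coords[node])
        travel = base * (1.0 + rng.uniform(-noise, noise))  # assumed noise model
        cost += travel
        clock += travel  # assumption: travel time equals travel cost
        if node != 0:
            opens, closes = windows[node]
            clock = max(clock, opens)  # wait if the vehicle arrives early
            cost += late_penalty * max(0.0, clock - closes)  # assumed penalty
        position = node
    return cost


if __name__ == "__main__":
    coords = {0: (0.0, 0.0), 1: (3.0, 4.0), 2: (3.0, 0.0)}
    windows = {1: (0.0, 10.0), 2: (0.0, 8.0)}
    rng = random.Random(42)
    samples = [simulate_route_cost([1, 2], coords, windows, rng=rng) for _ in range(1000)]
    print("estimated expected cost:", sum(samples) / len(samples))
```

Averaging many such samples gives a Monte Carlo estimate of a route's expected cost. A learned policy, such as the attention-based network described above, would be trained to choose routes that keep this kind of expected cost low.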