Abstract: We investigate a data-driven multiperiod inventory replenishment problem with uncertain demand and vendor lead time (VLT), given access to a large quantity of historical data. Unlike the traditional two-step predict-then-optimize (PTO) solution framework, we propose a one-step end-to-end (E2E) framework that uses deep learning models to output the suggested replenishment amount directly from input features without any intermediate step. The E2E model is trained to capture the behavior of the optimal dynamic programming solution under historical observations without any prior assumptions on the distributions of the demand and the VLT. Through a series of numerical experiments using real data from one of the leading e-commerce companies, we demonstrate the advantages of the proposed E2E model over conventional PTO frameworks. We also conduct a field experiment with JD.com, and the results show that our new algorithm reduces holding cost, stockout cost, total inventory cost, and turnover rate substantially compared with JD’s current practice. For the supply chain management industry, our E2E model shortens the decision process and provides an automatic inventory management solution with the possibility to generalize and scale. The concept of E2E, which uses the input information directly for the ultimate goal, can also be useful in practice for other supply chain management circumstances.

This paper was accepted by Hamid Nazerzadeh, big data analytics—fast track.

Funding: This research was supported by the National Key Research and Development Program of China [Grant 2018YFB1700600] and the National Natural Science Foundation of China [Grants 71991462 and 91746210].

Supplemental Material: The online appendix and data are available at https://doi.org/10.1287/mnsc.2022.4564.

Deep Reinforcement Learning for Large-Scale Inventory Management

Can Deep Reinforcement Learning Improve Inventory Management? Performance on Dual Sourcing, Lost Sales and Multi-Echelon Problems

Solving Inventory Management Problems Through Deep Reinforcement Learning

Performance of deep reinforcement learning algorithms in two-echelon inventory control systems

Deep Reinforcement Learning for Solving Management Problems: Towards A Large Management Mode

Deep Inventory Management

Deep reinforcement learning for demand fulfillment in online retail

Scalable multi-product inventory control with lead time constraints using reinforcement learning

Deep Reinforcement Learning Approach for Capacitated Supply Chain optimization under Demand Uncertainty

Multi-echelon inventory optimization using deep reinforcement learning

Deep Controlled Learning for Inventory Control

Deep Reinforcement Learning for inventory optimization with non-stationary uncertain demand

A Deep Q-Network Based on Radial Basis Functions for Multi-Echelon Inventory Management

Reinforcement Learning for Multi-Product Multi-Node Inventory Management in Supply Chains

Cooperative Multi-Agent Reinforcement Learning for Inventory Management

A Practical End-to-End Inventory Management Model with Deep Learning

Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management

Deep Reinforcement Learning for Dynamic Order Picking in Warehouse Operations

Case-based reinforcement learning for dynamic inventory control in a multi-agent supply-chain system

Optimizing Robotic Mobile Fulfillment Systems for Order Picking Based on Deep Reinforcement Learning

Deep Reinforcement Learning for Multi-Truck Vehicle Routing Problems with Multi-Leg Demand Routes