Abstract: This research introduces an approach to supply chain optimization and management called Deep Reinforcement Learning based Supply Chain Optimization and Management (DRL-SCOM). The approach builds on advances in Deep Reinforcement Learning (DRL), integrating Randomized Ensembled Double Q-learning (REDQ) with Trust Region Policy Optimization (TRPO) to address the complexity and dynamics characteristic of supply chain management. REDQ mitigates the overestimation bias commonly associated with traditional Q-learning methods, which yields more accurate value estimation and policy improvement. TRPO contributes safe and stable policy updates, and this stability is vital for maintaining robustness in the fluctuating environment of supply chain operations. Combining REDQ’s ensemble learning with TRPO’s trust-region method enables the framework to navigate the complex, high-dimensional decision space typical of supply chains efficiently, allowing real-time optimization of decisions within operational constraints. The DRL-SCOM methodology shows significant potential across supply chain management, from demand forecasting and inventory management to logistics, handling the nonlinearities and uncertainties prevalent in these areas. It paves the way for a more agile, responsive, and intelligent system that can adapt to changing market demands and operational challenges, and it represents a step towards making supply chain management a more data-driven and adaptive field.
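
To make the two ingredients named in the abstract concrete, the following is a minimal sketch and not the DRL-SCOM implementation. It assumes a simplified REDQ-style target (the minimum over a random subset of an ensemble of Q-estimates, without the entropy term or terminal-state handling used in the published REDQ algorithm) and a TRPO-style acceptance check on the mean KL divergence between old and new categorical policies. The function names, array shapes, and default values (`gamma`, `subset_size`, `max_kl`) are illustrative assumptions.

```python
import numpy as np


def redq_target(rewards, next_q_values, gamma=0.99, subset_size=2, rng=None):
    """Simplified REDQ-style bootstrapped target (illustrative assumption).

    rewards:        array of shape (batch,)
    next_q_values:  array of shape (ensemble, batch), one row per Q-estimate
                    evaluated at the next state-action pair
    """
    rng = np.random.default_rng() if rng is None else rng
    ensemble_size = next_q_values.shape[0]
    # Draw a random subset of ensemble members without replacement.
    idx = rng.choice(ensemble_size, size=subset_size, replace=False)
    # Taking the minimum over the subset counteracts overestimation bias.
    min_q = next_q_values[idx].min(axis=0)
    return rewards + gamma * min_q


def within_trust_region(old_probs, new_probs, max_kl=0.01):
    """Check a TRPO-style constraint: mean KL(old || new) <= max_kl.

    old_probs, new_probs: arrays of shape (batch, actions) holding strictly
    positive action probabilities of categorical policies.
    """
    kl = np.sum(old_probs * (np.log(old_probs) - np.log(new_probs)), axis=1)
    return float(kl.mean()) <= max_kl
```

In this sketch, a candidate policy update would be accepted only when `within_trust_region` returns `True`; a full TRPO implementation additionally computes the update direction with a conjugate-gradient step and a backtracking line search, which is omitted here.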

A Multi-Agent Coordination of a Supply Chain Ordering Management with Multiple Members Using Reinforcement Learning

Coordination of Supply Chains involving Various Agents with Behavioral Preferences

Deep Reinforcement Learning Approach for Capacitated Supply Chain optimization under Demand Uncertainty

Case-based reinforcement learning for dynamic inventory control in a multi-agent supply-chain system

Multi-Agent Deep Reinforcement Learning for Multi-Echelon Inventory Management

Agent based modelling for continuously varying supply chains

Solving Inventory Management Problems Through Deep Reinforcement Learning

Reinforcement Learning for Multi-Product Multi-Node Inventory Management in Supply Chains

Reinforcement Learning Provides a Flexible Approach for Realistic Supply Chain Safety Stock Optimisation

Cooperative Multi-Agent Reinforcement Learning for Inventory Management

Performance of deep reinforcement learning algorithms in two-echelon inventory control systems

Deep reinforcement learning for demand fulfillment in online retail

A Reinforcement Learning Based Approach For Multi-Projects Scheduling In Cloud Manufacturing

A Cooperative Multi-Agent Reinforcement Learning Framework for Resource Balancing in Complex Logistics Network

Research on Supply Chain Optimization and Management Based on Deep Reinforcement Learning

Hierarchical Reinforcement Learning for Crude Oil Supply Chain Scheduling

Deep Reinforcement Learning for Large-Scale Inventory Management

Can Deep Reinforcement Learning Improve Inventory Management? Performance on Dual Sourcing, Lost Sales and Multi-Echelon Problems

Deep Reinforcement Learning for Multi-Truck Vehicle Routing Problems with Multi-Leg Demand Routes

An Analysis of Multi-Agent Reinforcement Learning for Decentralized Inventory Control Systems

Optimizing Robotic Mobile Fulfillment Systems for Order Picking Based on Deep Reinforcement Learning