Abstract: We apply Multi-Agent Deep Reinforcement Learning (MADRL) to multi-echelon inventory management problems and evaluate its performance in minimizing the overall costs of a supply chain. We also examine whether the upfront-only information-sharing mechanism used in MADRL helps alleviate the bullwhip effect in a supply chain. Specifically, we apply Heterogeneous-Agent Proximal Policy Optimization (HAPPO), a MADRL algorithm, to decentralized multi-echelon inventory management problems in both a serial supply chain and a supply chain network. Our results show that policies constructed by HAPPO achieve lower overall costs than policies constructed by single-agent deep reinforcement learning and by heuristic policies. HAPPO also produces a less pronounced bullwhip effect than policies constructed by single-agent deep reinforcement learning, in which information is not shared among actors. Somewhat surprisingly, HAPPO achieves lower overall costs when each actor’s minimization target is a combination of its own costs and the overall costs of the system than when each actor’s target is the overall costs of the system alone. Our results provide a new perspective on how information sharing inside the supply chain helps alleviate the bullwhip effect and improve the overall performance of the system. Both upfront information sharing and action coordination among actors during model training are essential for improving a supply chain’s overall performance when applying MADRL, with information sharing the more essential of the two. Policies learned by MADRL perform best in practice when actors are neither fully self-interested nor fully system-focused. Our results also verify MADRL’s potential for solving various multi-echelon inventory management problems with complex supply chain structures and in non-stationary market environments.
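The idea of a minimization target that combines an actor's own costs with the overall costs of the system can be illustrated with a minimal sketch. The linear blend, the weight `alpha`, and the function name `blended_costs` are illustrative assumptions; the abstract does not state how the two cost terms are actually combined.

```python
from typing import List, Sequence


def blended_costs(own_costs: Sequence[float], alpha: float) -> List[float]:
    """Blend each actor's own cost with the system-wide total cost.

    alpha = 1.0 makes every actor fully self-interested;
    alpha = 0.0 makes every actor fully system-focused.
    The linear form and `alpha` are assumptions made for illustration.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    total = sum(own_costs)  # overall cost of the supply chain in this period
    return [alpha * c + (1.0 - alpha) * total for c in own_costs]


# Hypothetical per-period costs for three actors in a serial supply chain.
costs = [4.0, 10.0, 6.0]
print(blended_costs(costs, 1.0))  # each actor minimizes only its own cost
print(blended_costs(costs, 0.0))  # each actor minimizes the system total
print(blended_costs(costs, 0.5))  # an intermediate mix of the two
```

Under these assumptions, intermediate values of `alpha` correspond to actors that are neither fully self-interested nor fully system-focused.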

A Versatile Multi-Agent Reinforcement Learning Benchmark for Inventory Management

Learning to Cooperate: Application of Deep Reinforcement Learning for Online AGV Path Finding

MARLIM: Multi-Agent Reinforcement Learning for Inventory Management

Multiagent Reinforcement Learning for Strictly Constrained Tasks Based on Reward Recorder

Multi-Agent Reinforcement Learning with Shared Resources for Inventory Management

Cooperative Multi-Agent Reinforcement Learning for Inventory Management

NeuronsMAE: A Novel Multi-Agent Reinforcement Learning Environment for Cooperative and Competitive Multi-Robot Tasks

MOMAland: A Set of Benchmarks for Multi-Objective Multi-Agent Reinforcement Learning

Learning and Calibrating Heterogeneous Bounded Rational Market Behaviour with Multi-Agent Reinforcement Learning

InvAgent: A Large Language Model based Multi-Agent System for Inventory Management in Supply Chains

Multi-Agent Reinforcement Learning for Autonomous Driving: A Survey

From Multi-agent to Multi-robot: A Scalable Training and Evaluation Platform for Multi-robot Reinforcement Learning

Whittle Index with Multiple Actions and State Constraint for Inventory Management

Multi-Agent Deep Reinforcement Learning for Multi-Echelon Inventory Management

An Analysis of Multi-Agent Reinforcement Learning for Decentralized Inventory Control Systems

IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning

IMP-MARL: a Suite of Environments for Large-scale Infrastructure Management Planning via MARL

Multi-agent Reinforcement Learning for Dynamic Dispatching in Material Handling Systems

Scalable Multi-Agent Reinforcement Learning for Warehouse Logistics with Robotic and Human Co-Workers