Abstract: In retail (e.g., grocery stores, apparel shops, online retailers), inventory managers have to balance short-term risk (no items to sell) against long-term risk (over-ordering that leads to product waste). This balancing task is especially hard because information about future customer purchases is lacking. In this paper, we study the problem of restocking a grocery store's inventory of perishable items over time, from a distributional point of view. The objective is to maximize sales while minimizing waste, under uncertainty about actual consumption by customers. This problem is highly relevant today, given the growing demand for food and the impact of food waste on the environment, the economy, and purchasing power. We frame inventory restocking as a new reinforcement learning task that exhibits stochastic behavior conditioned on the agent's actions, making the environment partially observable. We make two main contributions. First, we introduce a new reinforcement learning environment, RetaiL, based on real grocery store data and expert knowledge. This environment is highly stochastic and presents a unique challenge for reinforcement learning practitioners. We show that classical supply chain algorithms do not handle uncertainty about the future behavior of the environment well, and that distributional approaches are a good way to account for this uncertainty. Second, we introduce GTDQN, a distributional reinforcement learning algorithm that learns a generalized Tukey Lambda distribution over the reward space. GTDQN provides a strong baseline for our environment. It outperforms other distributional reinforcement learning approaches in this partially observable setting, both in overall reward and in reduction of generated waste.

A Reinforcement Learning Method for Inventory Control under State-based Stochastic Demand

Inventory Optimization of Manufacturing/remanufacturing Hybrid System with Stochastic Leadtime

Dynamic inventory replenishment strategy for aerospace manufacturing supply chain: combining reinforcement learning and multi-agent simulation

Deep Inventory Management

Case-based reinforcement learning for dynamic inventory control in a multi-agent supply-chain system

Can Deep Reinforcement Learning Improve Inventory Management? Performance on Dual Sourcing, Lost Sales and Multi-Echelon Problems

Solving Inventory Management Problems Through Deep Reinforcement Learning

Deep Reinforcement Learning for inventory optimization with non-stationary uncertain demand

Reinforcement Learning Provides a Flexible Approach for Realistic Supply Chain Safety Stock Optimisation

Scalable multi-product inventory control with lead time constraints using reinforcement learning

Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning

Deep Reinforcement Learning for Large-Scale Inventory Management

Cooperative Multi-Agent Reinforcement Learning for Inventory Management

Algorithmic Approaches to Inventory Management Optimization

Deep Reinforcement Learning Approach for Capacitated Supply Chain optimization under Demand Uncertainty

Reinforcement Learning with Intrinsically Motivated Feedback Graph for Lost-sales Inventory Control

Reinforcement Learning for Optimizing Can-Order Policy with the Rolling Horizon Method

A Simulation Environment and Reinforcement Learning Method for Waste Reduction

Interpretable Reinforcement Learning via Neural Additive Models for Inventory Management

Implementing Reinforcement Learning Algorithms in Retail Supply Chains with OpenAI Gym Toolkit

Reinforcement Learning for Multi-Product Multi-Node Inventory Management in Supply Chains