Abstract: The COVID-19 pandemic has highlighted the importance of supply chains and the role of digital management in reacting to dynamic changes in the environment. In this work, we focus on developing dynamic inventory ordering policies for a multi-echelon, i.e., multi-stage, supply chain. Traditional inventory optimization methods aim to determine a static reordering policy. Thus, these policies are not able to adjust to dynamic changes such as those observed during the COVID-19 crisis. On the other hand, conventional strategies offer the advantage of being interpretable, which is a crucial feature for supply chain managers in order to communicate decisions to their stakeholders. To address this limitation, we propose an interpretable reinforcement learning approach that aims to be as interpretable as the traditional static policies while being as flexible and environment-agnostic as other deep learning-based reinforcement learning solutions. We propose to use Neural Additive Models as an interpretable dynamic policy of a reinforcement learning agent, showing that this approach is competitive with a standard fully connected policy. Finally, we use the interpretability property to gain insights into a complex ordering strategy for a simple, linear three-echelon inventory supply chain.

What problem does this paper attempt to address?

The paper addresses inventory optimization in multi-echelon supply chains. Specifically, it develops dynamic inventory ordering policies that can cope with changes in the environment, such as the demand fluctuations and supply chain disruptions seen during the COVID-19 pandemic. Traditional inventory optimization methods usually produce a static reordering policy, which cannot adapt to a rapidly changing environment. Their advantage is interpretability, which supply chain managers need in order to communicate decisions to stakeholders. To combine the two, the paper proposes an interpretable reinforcement learning approach in which the agent's policy is a Neural Additive Model (NAM). The goal is to keep the interpretability of traditional static policies while retaining the flexibility and environment-agnostic nature of other deep learning-based reinforcement learning solutions. The paper reports that the NAM policy is competitive with a standard fully connected policy, and it uses the model's interpretability to gain insight into a complex ordering strategy. Using a simple, linear three-echelon inventory supply chain as the example, it shows how to extract shape functions from the trained NAM so that supply chain managers can see how each feature influences the reorder quantity. In this way, the paper offers an approach that makes complex neural policies more transparent and easier to understand in practical applications.
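To make the idea of a shape function concrete, the following is a minimal sketch of an additive policy in plain Python. It is not the paper's implementation: the feature names, network sizes, and random (untrained) weights are assumptions chosen for illustration, and the policy returns a single scalar where a real agent would output one reorder quantity per stage. It only shows the structural property that makes a NAM interpretable: the output is a sum of per-feature contributions, each computed by a small network that sees only its own feature.

```python
import random


class ShapeFunction:
    """Tiny one-hidden-layer ReLU network: one scalar feature in, one scalar out."""

    def __init__(self, hidden, rng):
        # Random, untrained weights (assumption: training is out of scope here).
        self.w1 = [rng.gauss(0, 1) for _ in range(hidden)]
        self.b1 = [rng.gauss(0, 1) for _ in range(hidden)]
        self.w2 = [rng.gauss(0, 1) for _ in range(hidden)]

    def __call__(self, x):
        return sum(
            w2 * max(0.0, w1 * x + b1)
            for w1, b1, w2 in zip(self.w1, self.b1, self.w2)
        )


class AdditivePolicy:
    """Policy output = bias + sum of independent per-feature shape functions."""

    def __init__(self, feature_names, hidden=8, seed=0):
        rng = random.Random(seed)
        self.feature_names = list(feature_names)
        self.shape_functions = [ShapeFunction(hidden, rng) for _ in self.feature_names]
        self.bias = 0.0

    def contributions(self, state):
        # One contribution per feature; feature i never influences contribution j.
        return [f(x) for f, x in zip(self.shape_functions, state)]

    def __call__(self, state):
        return self.bias + sum(self.contributions(state))


# Hypothetical state features for a three-echelon chain (names are illustrative).
policy = AdditivePolicy(["inventory_stage_1", "inventory_stage_2", "inventory_stage_3"])

state_a = [10.0, 4.0, 2.0]
state_b = [0.0, 4.0, 2.0]  # differs from state_a only in the first feature
contrib_a = policy.contributions(state_a)
contrib_b = policy.contributions(state_b)

# "Extracting" a shape function: evaluate one feature's network on a grid of values.
grid = [0.0, 5.0, 10.0, 15.0, 20.0]
curve = [policy.shape_functions[0](x) for x in grid]

for name, value in zip(policy.feature_names, contrib_a):
    print(f"{name}: {value:+.3f}")
```

Because each contribution depends on a single feature, plotting `curve` against `grid` gives a complete picture of how that feature affects the output, which is not possible for a fully connected policy where all features interact.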

Interpretable Reinforcement Learning via Neural Additive Models for Inventory Management