Applying graph neural network to SupplyGraph for supply chain network

Kihwan Han
2024-08-24
Abstract: Supply chain networks describe interactions among products, manufacturing facilities, and storage locations in the context of product supply and demand. Supply chain data are inherently graph-structured; thus, they are fertile ground for applications of graph neural networks (GNNs). Very recently, a supply chain dataset, SupplyGraph, was released to the public. Though the SupplyGraph dataset is valuable given the scarcity of publicly available data, the description of the dataset, the data quality assurance process, and the hyperparameters of the selected models lacked clarity. Further, for generalizability, it would be more convincing to present the findings through statistical analyses of the distribution of errors rather than the average error alone. Therefore, this study assessed SupplyGraph with greater clarity on the analysis process, data quality assurance, and machine learning (ML) model specifications. After data quality assurance procedures, this study compared the performance of Multilayer Perceptron (MLP), Graph Convolutional Network (GCN), and Graph Attention Network (GAT) on a demand forecasting task while matching hyperparameters as closely as feasible. The analyses revealed that GAT performed best, followed by GCN and MLP. These performance differences were statistically significant at $\alpha = 0.05$ after correction for multiple comparisons. This study also discusses several considerations in applying GNNs to supply chain networks. The current study reinforces the previous study on the supply chain benchmark dataset with respect to the description of the dataset and methodology, so that future research on applications of GNNs to supply chains becomes more reproducible.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper evaluates and improves the application of Graph Neural Networks (GNNs) to supply chain networks, focusing on their performance in demand forecasting. Specifically, the research aims to:

1. **Improve the clarity of the dataset description**: Although the SupplyGraph dataset is valuable, previous work was not sufficiently transparent about the dataset description, the data quality assurance process, and the model hyperparameters. This study therefore analyzes and describes the SupplyGraph dataset in more detail.
2. **Ensure data quality**: The study applies data quality assurance measures, such as removing duplicate nodes and edges, so that the dataset used in the experiments is reliable.
3. **Compare the performance of different models**: The study compares Multilayer Perceptron (MLP), Graph Convolutional Network (GCN), and Graph Attention Network (GAT) on the demand forecasting task to test whether GNNs outperform a traditional deep learning method such as MLP.
4. **Statistically analyze the error distribution**: Rather than reporting only the average error, the study compares the error distributions of the models with statistical tests (the Kruskal-Wallis H-test and the Wilcoxon-Mann-Whitney U-test).
5. **Explore considerations for applying GNNs to supply chains**: The study discusses factors to consider when applying GNNs to supply chain networks, such as the choice of graph structure and the over-smoothing problem.

Together, these steps lead the study to report that the GNNs (GAT, followed by GCN) outperformed MLP on the demand forecasting task, and to provide reproducible methods and dataset descriptions for future research.
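The duplicate-removal step mentioned in item 2 can be illustrated with a minimal sketch. This is not the paper's procedure: the function name `dedupe_graph`, the node labels, and the choice to treat edges as undirected (so that `(a, b)` and `(b, a)` count as the same edge) are all assumptions made for illustration.

```python
def dedupe_graph(nodes, edges):
    """Remove repeated nodes and repeated edges from a small graph.

    Assumption (not from the paper): edges are undirected, so each edge is
    stored with its endpoints in sorted order before checking for repeats.
    """
    # dict.fromkeys keeps the first occurrence of each node, in order.
    unique_nodes = list(dict.fromkeys(nodes))

    seen = set()
    unique_edges = []
    for u, v in edges:
        key = (u, v) if u <= v else (v, u)  # canonical undirected form
        if key not in seen:
            seen.add(key)
            unique_edges.append(key)
    return unique_nodes, unique_edges


# Hypothetical product nodes and edges, with deliberate duplicates.
nodes = ["P1", "P2", "P2", "P3"]
edges = [("P1", "P2"), ("P2", "P1"), ("P2", "P3"), ("P2", "P3")]

clean_nodes, clean_edges = dedupe_graph(nodes, edges)
print(clean_nodes)
print(clean_edges)
```

If the graph were directed, the canonical-ordering line would be dropped so that `(a, b)` and `(b, a)` remain distinct edges.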
### Formula Summary

The formulas involved in the paper mainly include:

- **Mean Squared Error (MSE)**:
  \[ MSE=\frac{1}{n}\sum_{i = 1}^{n}(y_{i}-\hat{y}_{i})^{2} \]
  where \(y_{i}\) is the true value, \(\hat{y}_{i}\) is the predicted value, and \(n\) is the number of samples.
- **Kruskal-Wallis H-test**:
  \[ H=\frac{12}{N(N + 1)}\sum_{i = 1}^{k}\frac{R_{i}^{2}}{n_{i}}-3(N + 1) \]
  where \(N\) is the total number of samples, \(k\) is the number of groups, \(R_{i}\) is the rank sum of the \(i\)-th group, and \(n_{i}\) is the number of samples in the \(i\)-th group.
- **Wilcoxon-Mann-Whitney U-test**:
  \[ U_{1}=n_{1}n_{2}+\frac{n_{1}(n_{1}+1)}{2}-R_{1} \]
  where \(n_{1}\) and \(n_{2}\) are the numbers of samples in the two groups, and \(R_{1}\) is the rank sum of the first group.

MSE measures the forecasting error of each model, while the two rank-based tests assess whether the error distributions differ significantly across all models (H-test) and between pairs of models (U-test).
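The three formulas can be checked numerically with a short standard-library sketch. It follows the formulas exactly as written above, so it omits the tie correction for \(H\) and computes test statistics only, not p-values. The helper names and the per-model error values are hypothetical, not results from the paper.

```python
def mse(y_true, y_pred):
    """Mean squared error between true and predicted values."""
    return sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / len(y_true)


def average_ranks(values):
    """1-based ranks of `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sums(groups):
    """Rank the pooled samples, then sum the ranks within each group."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    sums, start = [], 0
    for g in groups:
        sums.append(sum(ranks[start:start + len(g)]))
        start += len(g)
    return sums


def kruskal_wallis_h(groups):
    """H statistic as in the formula above (no tie correction)."""
    n_total = sum(len(g) for g in groups)
    total = sum(r ** 2 / len(g) for r, g in zip(rank_sums(groups), groups))
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


def mann_whitney_u1(a, b):
    """U_1 = n1*n2 + n1*(n1+1)/2 - R1, with R1 the rank sum of `a`."""
    n1, n2 = len(a), len(b)
    r1 = rank_sums([a, b])[0]
    return n1 * n2 + n1 * (n1 + 1) / 2 - r1


# Hypothetical per-run errors for three models (illustrative values only).
mlp_err = [0.9, 1.1, 1.3]
gcn_err = [0.6, 0.7, 0.8]
gat_err = [0.3, 0.4, 0.5]

h_stat = kruskal_wallis_h([mlp_err, gcn_err, gat_err])
u_stat = mann_whitney_u1(gat_err, mlp_err)
print(h_stat, u_stat)
```

With this form of \(U_{1}\), the statistic counts the pairs in which a first-group value ranks below a second-group value, so it reaches its maximum \(n_{1}n_{2}\) when every error in the first group is smaller. Some statistical libraries report the complementary statistic \(n_{1}n_{2}-U_{1}\) instead, so the convention should be checked before comparing numbers.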