M5 Competition Uncertainty: Overdispersion, distributional forecasting, GAMLSS and beyond

Florian Ziel
DOI: https://doi.org/10.1016/j.ijforecast.2021.09.008
2021-11-10
Abstract:The M5 competition uncertainty track aims for probabilistic forecasting of sales of thousands of Walmart retail goods. We show that the M5 competition data faces strong overdispersion and sporadic demand, especially zero demand. We discuss resulting modeling issues concerning adequate probabilistic forecasting of such count data processes. Unfortunately, the majority of popular prediction methods used in the M5 competition (e.g. lightgbm and xgboost GBMs) fails to address the data characteristics due to the considered objective functions. The distributional forecasting provides a suitable modeling approach for to the overcome those problems. The GAMLSS framework allows flexible probabilistic forecasting using low dimensional distributions. We illustrate, how the GAMLSS approach can be applied for the M5 competition data by modeling the location and scale parameter of various distributions, e.g. the negative binomial distribution. Finally, we discuss software packages for distributional modeling and their drawback, like the R package gamlss with its package extensions, and (deep) distributional forecasting libraries such as TensorFlow Probability.
Machine Learning,Artificial Intelligence,Applications,Methodology
What problem does this paper attempt to address?