A Retail Product Prediction Model Incorporating Machine Learning and Propensity Score Matching Method

Yuankun Li, Weili Kong, Xuexia Liang, Liangben Xu, Yuhua Mo, Kaidi Chen
DOI: https://doi.org/10.1109/CISAT62382.2024.10695432
2024-07-12
Abstract: Existing machine learning methods for retail product sales forecasting often rely only on the target product's own time series data and tend to ignore the correlation between the target product and other products. In this paper, we take cigarette retail sales prediction as an example and use neighboring, related alcohol sales data to predict cigarette sales through the propensity score matching (PSM) method, the KMeans++ clustering algorithm, and the XGBoost algorithm. Specifically, we use cigarette and alcohol sales data from Guangzhou, China, for 2021-2022 as the study sample. The results show that: 1) compared with using only tobacco data and machine learning algorithms, introducing neighboring alcohol sales data and combining PSM with the XGBoost algorithm improves modeling accuracy from 82.79% to 91.34%; and 2) the XGBoost algorithm identifies the important features of cigarette sales data, in order, as category, specification, price range, and revenue from neighboring alcohol sales (new features generated by the PSM and KMeans++ algorithms). These findings show that PSM and XGBoost can incorporate data on other retail products to predict a target product, and they provide an empirical application of retail sales prediction based on associations with neighboring related products.
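The abstract does not spell out how PSM links alcohol data to cigarette data, so the following is only a minimal sketch of one plausible reading: estimate a propensity score from covariates shared by both retailer types, match each cigarette retailer to the alcohol retailer with the closest score, and attach the matched alcohol revenue as a new feature. All names (`X_cig`, `X_alc`, `alc_revenue`, `neighbor_alcohol_revenue`), the synthetic data, and the hand-written logistic regression are assumptions made for illustration, not the authors' implementation. The KMeans++ clustering and XGBoost training steps are omitted.

```python
import numpy as np

def add_intercept(X):
    """Prepend a column of ones so the model has an intercept term."""
    return np.column_stack([np.ones(len(X)), X])

def fit_propensity(X, t, lr=0.1, steps=2000):
    """Fit a logistic regression P(t=1 | X) by plain gradient descent."""
    Xb = add_intercept(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * (Xb.T @ (p - t)) / len(t)
    return w

def propensity(X, w):
    """Propensity score for each row of X under fitted weights w."""
    return 1.0 / (1.0 + np.exp(-(add_intercept(X) @ w)))

def match_nearest(scores_a, scores_b):
    """For each score in a, return the index of the closest score in b
    (1-nearest-neighbor matching with replacement)."""
    return np.abs(scores_a[:, None] - scores_b[None, :]).argmin(axis=1)

# Synthetic stand-in data (hypothetical): two shared covariates per retailer.
rng = np.random.default_rng(0)
X_cig = rng.normal(0.3, 1.0, size=(50, 2))    # cigarette retailers
X_alc = rng.normal(0.0, 1.0, size=(60, 2))    # alcohol retailers
alc_revenue = rng.uniform(1e3, 1e4, size=60)  # alcohol revenue per retailer

# Treat "is a cigarette retailer" as the treatment indicator.
X_all = np.vstack([X_cig, X_alc])
t_all = np.concatenate([np.ones(len(X_cig)), np.zeros(len(X_alc))])
w = fit_propensity(X_all, t_all)

s_cig = propensity(X_cig, w)
s_alc = propensity(X_alc, w)
idx = match_nearest(s_cig, s_alc)

# New feature: revenue of the matched alcohol retailer.
neighbor_alcohol_revenue = alc_revenue[idx]
```

Under these assumptions, `neighbor_alcohol_revenue` would be appended to the cigarette feature table before training a gradient-boosted model such as XGBoost.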
Computer Science, Business, Economics