Use of a Data-driven Approach to Build a White-box Model of the Relationship Between Sales and Sales Rank

Xian Yang, Zhijie Lin, Yuanfeng Cai
DOI: https://doi.org/10.2139/ssrn.4299571
2022-01-01
SSRN Electronic Journal
Abstract: Actual sales data are useful to both practitioners and researchers. However, most e-commerce websites publish only sales ranks, due to privacy concerns. Some studies therefore convert sales ranks into sales data using a sales–rank relationship model, such as the frequently applied power-law model. In this paper, we propose a novel data-driven approach to determine an intelligent model of the sales–rank relationship. By analyzing comprehensive data covering products from top-ranked to bottom-ranked, we find a new model that outperforms the traditional power-law model and better describes the long-tail phenomenon. For 38,037 products in 20 categories from a retailer’s website, the fitting performance (R²) of the new model is higher than that of the traditional model by an average of 8.28%. We also compare the models using data from a hotel-booking website and show that, for 14,072 hotels in five cities, the R² of the new model is higher by an average of 31.23%. When sales rank is used to predict sales, the new model yields a 64.32% increase in average prediction accuracy relative to the traditional model. These results demonstrate that the proposed approach can intelligently learn the generalized knowledge underlying a large dataset and precisely determine the relationship model in a white-box manner.
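For readers unfamiliar with the baseline, the traditional power-law model mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes the common form sales = a · rank^(−b), fits it by ordinary least squares on log-log axes, and computes R² on the log scale. The function names, the parameter values (a = 1000, b = 0.8), and the synthetic data are all hypothetical and are not taken from the paper; the paper's new model is not specified in the abstract and is not reproduced here.

```python
import numpy as np


def fit_power_law(ranks, sales):
    """Fit sales = a * rank**(-b) by least squares on log-log axes.

    Returns the tuple (a, b). Assumes all ranks and sales are positive.
    """
    log_r = np.log(np.asarray(ranks, dtype=float))
    log_s = np.log(np.asarray(sales, dtype=float))
    # log(sales) = log(a) - b * log(rank), so the slope is -b.
    slope, intercept = np.polyfit(log_r, log_s, 1)
    return np.exp(intercept), -slope


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic, illustrative data only (hypothetical a = 1000, b = 0.8,
# with multiplicative log-normal noise); not data from the paper.
rng = np.random.default_rng(0)
ranks = np.arange(1, 501, dtype=float)
sales = 1000.0 * ranks ** -0.8 * np.exp(rng.normal(0.0, 0.1, ranks.size))

a, b = fit_power_law(ranks, sales)
predicted = a * ranks ** -b
r2_log = r_squared(np.log(sales), np.log(predicted))
print(f"a = {a:.1f}, b = {b:.3f}, R^2 (log scale) = {r2_log:.3f}")
```

Fitting in log space keeps the problem linear, but it weights relative rather than absolute errors; whether the paper reports R² on the raw or the log scale is not stated in the abstract, so the log-scale choice here is an assumption.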