Forecast of seasonal consumption behavior of consumers and privacy-preserving data mining with new S-Apriori algorithm

Duy Thanh Tran, Jun-Ho Huh
DOI: https://doi.org/10.1007/s11227-023-05105-6
IF: 3.3
2023-03-18
The Journal of Supercomputing
Abstract: Supermarkets and retail stores now routinely use database-backed software systems to store customer transactions. The volume of this data grows over time and holds considerable hidden value: mining historical transactions reveals consumers' buying patterns and behavior, which can help improve sales by reaching customers more precisely. Data-mining techniques make it possible to extract aggregate information of many kinds, such as association rules for statistics and decision support in many fields. Most users of e-commerce systems and web platforms are concerned about privacy, for example the protection of their name, occupation, age, interests, residence, or sales transactions. Protecting the privacy of electronic service users is therefore an important consideration in data mining. For these reasons, the Apriori algorithm was studied and extended into a new S-Apriori algorithm built around the concept of seasonal shopping. This paper applies S-Apriori, an ORM model, SQL, and C# to build libraries for forecasting the seasonal consumption behavior of consumers. It also proposes a new Thanh and Huh Cryptography algorithm, used as a privacy-preserving filter during data-mining processing. Experiments were run on two datasets, a small dataset with 37 records and Microsoft's large Adventure dataset with 172,459 records, and the software reports association rules with their corresponding confidence ratios so that users can make decisions easily. In addition, the model will be packaged and published to the Microsoft NuGet ecosystem, so that developers and researchers can use it to build association rule mining systems or extend it further on the basis of the new S-Apriori model.
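To make the idea of seasonal association rules with a confidence ratio concrete, the following is a minimal Python sketch. It is not the paper's S-Apriori algorithm or its C# library: it assumes that each transaction carries a season label, restricts itself to rules between single items, and counts pairs by brute force rather than using Apriori's level-wise candidate generation. The function name `seasonal_rules`, the thresholds, and the sample baskets are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def seasonal_rules(transactions, season, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) rules for one season.

    transactions: list of (season_label, set_of_items) pairs (assumed layout).
    Only single-item -> single-item rules are produced in this sketch.
    """
    # Keep only the baskets recorded in the requested season.
    baskets = [frozenset(items) for label, items in transactions if label == season]
    n = len(baskets)
    if n == 0:
        return []

    # Count how many baskets contain each item and each pair of items.
    counts = Counter()
    for basket in baskets:
        for item in basket:
            counts[frozenset([item])] += 1
        for pair in combinations(sorted(basket), 2):
            counts[frozenset(pair)] += 1

    rules = []
    for itemset, count in counts.items():
        support = count / n
        if len(itemset) != 2 or support < min_support:
            continue
        a, b = sorted(itemset)
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = count / counts[frozenset([lhs])]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical sample data: three winter baskets and two summer baskets.
transactions = [
    ("winter", {"coat", "gloves"}),
    ("winter", {"coat", "gloves", "scarf"}),
    ("winter", {"coat"}),
    ("summer", {"sunscreen", "hat"}),
    ("summer", {"coat"}),
]
winter_rules = seasonal_rules(transactions, "winter")
for lhs, rhs, support, confidence in winter_rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Filtering by season before counting means that support and confidence are measured relative to that season's baskets only, so a rule that is weak over the whole year can still surface in the season where it holds.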
computer science, theory & methods; engineering, electrical & electronic; hardware & architecture