An Exploration of Clustering Algorithms for Customer Segmentation in the UK Retail Market

Jeen Mary John, Olamilekan Shobayo, Bayode Ogunleye
DOI: https://doi.org/10.3390/analytics2040042
2024-02-07
Abstract: Recently, people's awareness of online purchases has significantly risen. This has given rise to online retail platforms and the need for a better understanding of customer purchasing behaviour. Retail companies are pressed with the need to deal with a high volume of customer purchases, which requires sophisticated approaches to perform more accurate and efficient customer segmentation. Customer segmentation is a marketing analytical tool that aids customer-centric service and thus enhances profitability. In this paper, we aim to develop a customer segmentation model to improve decision-making processes in the retail market industry. To achieve this, we employed a UK-based online retail dataset obtained from the UCI machine learning repository. The retail dataset consists of 541,909 customer records and eight features. Our study adopted the RFM (recency, frequency, and monetary) framework to quantify customer values. Thereafter, we compared several state-of-the-art (SOTA) clustering algorithms, namely, K-means clustering, the Gaussian mixture model (GMM), density-based spatial clustering of applications with noise (DBSCAN), agglomerative clustering, and balanced iterative reducing and clustering using hierarchies (BIRCH). The results showed the GMM outperformed other approaches, with a Silhouette Score of 0.80.
Machine Learning, Applications, Artificial Intelligence, Computation
What problem does this paper attempt to address?
The main aim of this paper is to address the issue of customer segmentation in the retail market to improve decision-making processes and enhance the profitability of the market industry. Specifically, the research objectives include:

1. **Developing a customer segmentation model**: To better understand customer purchasing behavior and perform effective customer segmentation, the authors developed a customer segmentation model.
2. **Comparing the effectiveness of different clustering algorithms**: In this study, the authors employed several advanced unsupervised machine learning clustering algorithms and conducted comparative analyses to determine which algorithm performs best in customer segmentation.
3. **Applying the RFM framework to quantify customer value**: The study used the RFM (Recency, Frequency, and Monetary) framework to quantify customer value and classify customers based on this framework.
4. **Evaluating algorithm performance**: The performance of different clustering algorithms was evaluated using metrics such as the Silhouette Score to select the best algorithm.

In summary, the core objective of this paper is to explore and compare the applicability and effectiveness of several different clustering algorithms (such as K-means, Gaussian Mixture Model (GMM), DBSCAN, Agglomerative Clustering, and BIRCH) in the task of customer segmentation in the UK retail market. The goal is to find a method that can more accurately perform customer segmentation, thereby helping businesses make better marketing decisions. The research results indicate that the Gaussian Mixture Model performs the best in this task.
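To make the RFM framework and the Silhouette Score more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names `rfm_table` and `silhouette_score`, the tuple layout of the transactions, the snapshot date, and the toy data are all assumptions made for illustration. Recency is taken here as days since the last purchase, frequency as the number of distinct invoices, and monetary as total spend, which is one common reading of RFM.

```python
from collections import defaultdict
from datetime import date
from math import dist


def rfm_table(transactions, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, invoice_id, invoice_date, amount);
    this layout is hypothetical, not the dataset's actual schema.
    snapshot: reference date from which recency is measured.
    """
    last_purchase = {}
    invoices = defaultdict(set)
    spend = defaultdict(float)
    for customer, invoice, day, amount in transactions:
        last_purchase[customer] = max(last_purchase.get(customer, day), day)
        invoices[customer].add(invoice)
        spend[customer] += amount
    return {
        c: ((snapshot - last_purchase[c]).days, len(invoices[c]), round(spend[c], 2))
        for c in last_purchase
    }


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)


# Toy transactions (made up for illustration only).
transactions = [
    ("C1", "INV1", date(2011, 12, 1), 20.0),
    ("C1", "INV2", date(2011, 12, 8), 35.5),
    ("C2", "INV3", date(2011, 6, 1), 500.0),
]
rfm = rfm_table(transactions, snapshot=date(2011, 12, 10))

# Two well-separated toy clusters; the score should be close to 1.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
score = silhouette_score(points, labels=[0, 0, 1, 1])
```

In a comparison like the one described above, each clustering algorithm would assign labels to the (typically scaled) RFM vectors, and the labelling with the highest mean silhouette would be preferred. A score near 1 indicates compact, well-separated clusters, while a score near 0 indicates overlapping ones.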