SeqRFM: Fast RFM Analysis in Sequence Data

Yanxin Zheng, Wensheng Gan, Zefeng Chen, Pinlyu Zhou, Philippe Fournier-Viger
2024-11-08
Abstract: In recent years, data mining technologies have been well applied to many domains, including e-commerce. In customer relationship management (CRM), the RFM analysis model is one of the most effective approaches to increase the profits of major enterprises. However, with the rapid development of e-commerce, the diversity and abundance of e-commerce data pose a challenge to mining efficiency. Moreover, in actual market transactions, the chronological order of transactions reflects customer behavior and preferences. To address these challenges, we develop an effective algorithm called SeqRFM, which combines sequential pattern mining with RFM models. SeqRFM considers each customer's recency (R), frequency (F), and monetary (M) scores to represent the significance of the customer and identifies sequences with high recency, high frequency, and high monetary value. A series of experiments demonstrate the superiority and effectiveness of the SeqRFM algorithm compared to the most advanced RFM algorithms based on sequential pattern mining. The source code and datasets are available on GitHub at https://github.com/DSI-Lab1/SeqRFM.
Databases
What problem does this paper attempt to address?
This paper aims to improve the efficiency and accuracy of RFM analysis in sequence data. It proposes a new algorithm, SeqRFM, which combines sequential pattern mining (SPM) with the RFM model to identify customer behavior patterns with high recency, high frequency, and high monetary value. These patterns matter for customer relationship management (CRM) in e-commerce because they help enterprises understand customer behavior and preferences and formulate more effective marketing strategies.
The paper points out that, as e-commerce develops, the diversity and abundance of data challenge the efficiency of data mining. Traditional RFM analysis methods are effective, but they have limitations on large-scale, complex data sets. They also tend to overlook the chronological order of transactions, which reflects customer behavior and preferences. SeqRFM therefore represents the importance of each customer by that customer's recency, frequency, and monetary scores, and identifies sequences with high recency, high frequency, and high monetary value.
The paper lists the following key contributions:
1. Three revised definitions of the RFM pattern dimensions, which are more consistent with existing pattern-mining algorithms.
2. A new algorithm, SeqRFM, and a new data structure, the RFM-Tree, which stores auxiliary information, together with multiple pruning strategies that reduce the search space, designed specifically for mining compact RFM patterns in sequence databases.
3. A maximum-check strategy in SeqRFM that mines maximal RFM patterns to obtain a compressed set of mining results.
4. Experimental results showing that the proposed SeqRFM algorithm is superior to the existing state-of-the-art algorithms in terms of precision and effectiveness.
In conclusion, by introducing the SeqRFM algorithm, the paper addresses the problem of efficient and accurate RFM analysis in large-scale sequence data, which is significant for improving customer relationship management and marketing strategies in e-commerce.
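To make the idea of an RFM pattern more concrete, the following is a minimal brute-force sketch in Python, not the SeqRFM algorithm itself. It does not use the RFM-Tree or any pruning strategy. The score definitions are illustrative assumptions rather than the paper's revised definitions: frequency is the number of customer sequences that contain the pattern, recency is a time-decayed sum based on the last matched transaction, and monetary is the total amount spent on the matched items. All names (`first_match`, `rfm_scores`, `is_rfm_pattern`, `keep_maximal`, `example_db`, `decay`) are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Assumed data model: a customer sequence is a time-ordered list of
# transactions, each a (timestamp, {item: amount spent}) pair.
Transaction = Tuple[int, Dict[str, float]]
CustomerSeq = List[Transaction]
Pattern = Tuple[str, ...]


def first_match(seq: CustomerSeq, pattern: Sequence[str]) -> Optional[Tuple[int, float]]:
    """Find the leftmost occurrence of `pattern` as a subsequence of `seq`.

    Returns (timestamp of the last matched transaction, amount spent on the
    matched items), or None if the pattern does not occur.
    """
    pos, money, last_time = 0, 0.0, None
    for timestamp, items in seq:
        if pos < len(pattern) and pattern[pos] in items:
            money += items[pattern[pos]]
            last_time = timestamp
            pos += 1
    if pos == len(pattern) and last_time is not None:
        return last_time, money
    return None


def rfm_scores(db: List[CustomerSeq], pattern: Sequence[str],
               now: int, decay: float = 0.1) -> Tuple[float, int, float]:
    """Return (recency, frequency, monetary) under the assumed definitions."""
    recency, frequency, monetary = 0.0, 0, 0.0
    for seq in db:
        match = first_match(seq, pattern)
        if match is None:
            continue
        last_time, money = match
        frequency += 1
        recency += (1 - decay) ** (now - last_time)  # older matches count less
        monetary += money
    return recency, frequency, monetary


def is_rfm_pattern(scores: Tuple[float, int, float],
                   min_r: float, min_f: int, min_m: float) -> bool:
    """A pattern qualifies only if all three scores reach their thresholds."""
    r, f, m = scores
    return r >= min_r and f >= min_f and m >= min_m


def is_subsequence(small: Sequence[str], big: Sequence[str]) -> bool:
    """True if `small` can be obtained from `big` by deleting items."""
    it = iter(big)
    return all(item in it for item in small)


def keep_maximal(patterns: List[Pattern]) -> List[Pattern]:
    """Drop every pattern that is a proper subsequence of another pattern."""
    return [p for p in patterns
            if not any(p != q and is_subsequence(p, q) for q in patterns)]


# Hypothetical toy database with three customers.
example_db: List[CustomerSeq] = [
    [(1, {"a": 10.0}), (3, {"b": 5.0})],
    [(2, {"a": 20.0}), (4, {"c": 1.0}), (5, {"b": 5.0})],
    [(5, {"b": 7.0})],
]

scores = rfm_scores(example_db, ("a", "b"), now=5, decay=0.5)
print(scores)  # (1.25, 2, 40.0)
print(is_rfm_pattern(scores, min_r=1.0, min_f=2, min_m=30.0))  # True
print(keep_maximal([("a",), ("b",), ("a", "b")]))  # [('a', 'b')]
```

The final filter illustrates why mining only maximal patterns compresses the result set: `("a",)` and `("b",)` are dropped because both are contained in `("a", "b")`. An efficient miner would avoid enumerating and rescanning every candidate in this way, which is the role the summary above attributes to the RFM-Tree and the pruning strategies.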