Modeling the Telemarketing Process using Genetic Algorithms and Extreme Boosting: Feature Selection and Cost-Sensitive Analytical Approach

Nazeeh Ghatasheh, Ismail Altaharwa, Khaled Aldebei
DOI: https://doi.org/10.1109/ACCESS.2023.3292840
2023-10-30
Abstract: Currently, almost all direct marketing activities take place virtually rather than in person, weakening interpersonal skills at an alarming pace. Furthermore, businesses have been striving to sense and foster the tendency of their clients to accept a marketing offer. The digital transformation and the increased virtual presence have forced firms to seek novel marketing research approaches. This research aims at leveraging the power of telemarketing data in modeling the willingness of clients to make a term deposit and finding the most significant characteristics of the clients. Real-world data from a Portuguese bank and national socio-economic metrics are used to model the telemarketing decision-making process. This research makes two key contributions. First, it proposes a novel genetic algorithm-based classifier that selects the best discriminating features and tunes classifier parameters simultaneously. Second, it builds an explainable prediction model. The best-generated classification models were intensively validated using 50 times repeated 10-fold stratified cross-validation, and the selected features have been analyzed. The models significantly outperform the related works in terms of class-of-interest accuracy; they attained an average of 89.07% and 0.059 in terms of geometric mean and type I error, respectively. The model is expected to maximize the potential profit margin at the least possible cost and provide more insights to support marketing decision-making.
Machine Learning, Artificial Intelligence, Neural and Evolutionary Computing
What problem does this paper attempt to address?
The paper addresses the following problem: as digital transformation shifts marketing activity online, companies increasingly rely on remote channels such as telemarketing to understand and predict customers' purchase intentions. Building effective predictive models is difficult, however, because of class imbalance (far fewer customers accept a marketing offer than decline it), missing data, and external factors such as market dynamics and economic stability. Traditional marketing analysis methods often fail to handle these data characteristics comprehensively.

To address these challenges, the study proposes a new method based on a genetic algorithm (GA) that simultaneously selects the most discriminating features and optimizes classifier parameters. Specifically, the goals of the study are:

1. **Develop a high-performance predictive model**: accurately predict whether a customer is willing to accept a term deposit offer.
2. **Identify the most significant customer features**: find the key factors that influence whether customers accept a marketing offer.
3. **Maximize potential profit**: achieve the maximum profit margin at the lowest cost.
4. **Support marketing decisions**: provide more insights to support the company's marketing decision-making process.

Using real data from a Portuguese bank together with national socio-economic indicators, the study builds an interpretable predictive model and validates the best generated classification models with 50 repetitions of 10-fold stratified cross-validation. The results show that the models significantly outperform related work in accuracy on the class of interest, achieving an average geometric mean of 89.07% and a type I error of 0.059. This indicates that the method not only improves predictive performance but also provides more valuable business insights.
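
The two reported metrics can be illustrated with a minimal Python sketch. It assumes the common conventions that the geometric mean is the square root of sensitivity times specificity and that type I error is the false positive rate; the summary does not spell out the paper's exact definitions, so treat both as assumptions. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def type_i_error(tp, fp, tn, fn):
    """False positive rate, assumed here to be the type I error."""
    return fp / (fp + tn) if (fp + tn) else 0.0


# Toy imbalanced example: 2 clients accept the offer (1), 8 decline (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
accuracy = (counts[0] + counts[2]) / len(y_true)
print("accuracy:", accuracy)
print("g-mean:", round(g_mean(*counts), 4))
print("type I error:", type_i_error(*counts))
```

In this toy case plain accuracy is 0.8 even though only half of the accepting clients are found, which is why a metric such as the geometric mean, which penalizes poor performance on either class, is a natural fit for imbalanced telemarketing data.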