Identification of Customer Churn Considering Difficult Case Mining

Jianfeng Li,Xue Bai,Qian Xu,Dexiang Yang
DOI: https://doi.org/10.3390/systems11070325
2023-06-25
Systems
Abstract:In the process of user churn modeling, due to the imbalance between lost users and retained users, the use of traditional classification models often cannot accurately and comprehensively identify users with churn tendency. To address this issue, it is not sufficient to simply increase the misclassification cost of minority class samples in cost-sensitive methods. This paper proposes using the Focal Loss hard example mining technique to add the class weight α and the focus parameter γ to the cross-entropy loss function of LightGBM. In addition, it emphasizes the identification of customers at risk of churning and raises the cost of misclassification for minority and difficult-to-classify samples. On the basis of the preceding ideas, the FocalLoss_LightGBM model is proposed, along with random forests, SVM, XGBoost, and LightGBM. Empirical analysis based on a dataset of credit card users publicly available on the Kaggle website. The AUC, TPR, and G-mean index values were superior to the existing model, which can effectively improve the accuracy and stability of potential lost users.
What problem does this paper attempt to address?