Predicting Customer Satisfaction by Replicating the Survey Response Distribution

Etienne Manderscheid, Matthias Lee
2024-11-19
Abstract: For many call centers, customer satisfaction (CSAT) is a key performance indicator (KPI). However, only a fraction of customers take the CSAT survey after the call, leading to a biased and inaccurate average CSAT value, and missed opportunities for coaching, follow-up, and rectification. Therefore, call centers can benefit from a model predicting customer satisfaction on calls where the customer did not complete the survey. Given that CSAT is a closely monitored KPI, it is critical to minimize any bias in the average predicted CSAT (pCSAT). In this paper, we introduce a method such that predicted CSAT (pCSAT) scores accurately replicate the distribution of survey CSAT responses for every call center with sufficient data in a live production environment. The method can be applied to many multiclass classification problems to improve the class balance and minimize its changes upon model updates.
Machine Learning, Artificial Intelligence, Computation and Language
### What problems does this paper attempt to solve?

This paper aims to solve several key problems in call-center customer satisfaction (CSAT) prediction:

1. **Bias Caused by Low Response Rate**:
   - After a call ends, only a small fraction of customers complete the CSAT survey; the average response rate is usually only 8%. This makes CSAT scores biased and inaccurate, which in turn distorts the evaluation of call-center performance.
   - Because the satisfaction of non-responding customers is unknown, the low response rate can lead to misjudging call-center performance.
2. **Accurately Predicting the Satisfaction of Customers Who Have Not Completed the Survey**:
   - The paper proposes a method that predicts the satisfaction (pCSAT) of customers who did not complete the survey, to make up for the data lost to the low response rate.
   - The key requirement is that the predicted pCSAT scores accurately replicate the distribution of actual CSAT survey responses, which reduces bias and makes the predictions more reliable.
3. **Class-Distribution Changes Caused by Model Updates**:
   - Updating a machine-learning model may change the relative balance of its output classes, leading to unexpected results. A control mechanism is therefore needed so that, after the model is updated, the predicted pCSAT distribution remains consistent with the actual CSAT distribution.
   - The paper proposes optimizing the decision threshold so that the predicted pCSAT distribution is as close as possible to the actual CSAT distribution, thereby minimizing the impact of model updates.
4. **Class Balance in Multiclass Classification Problems**:
   - The method is not limited to CSAT prediction; it can also be applied to other multiclass classification problems to improve class balance and minimize its changes during model updates.

By solving these problems, the paper hopes to provide a more accurate and reliable CSAT prediction method that helps call centers evaluate performance and improve their services.

### Formula Summary

To ensure that pCSAT scores accurately replicate the distribution of actual CSAT survey responses, the paper introduces a loss function for optimizing the decision threshold:

\[ \text{Loss} = \Delta \%_{p,c} + \Delta \text{avg}_{p,c} + \text{MSE}_{p,c} \]

where:

- \(\Delta \%_{p,c}\) is the absolute difference between the pCSAT and CSAT satisfaction percentages, that is, the shares of scores of 4 or higher:

  \[ \Delta \%_{p,c} = \left| (\% \text{ of } pCSAT \geq 4) - (\% \text{ of } CSAT \geq 4) \right| \]

- \(\Delta \text{avg}_{p,c}\) is the absolute difference between the pCSAT and CSAT averages:

  \[ \Delta \text{avg}_{p,c} = \left| \text{avg\_pcsat} - \text{avg\_csat} \right| \]

- \(\text{MSE}_{p,c}\) is the mean squared error between the normalized pCSAT and CSAT distributions:

  \[ \text{MSE}_{p,c} = \text{MSE}(\vec{\text{pcsat}}, \vec{\text{csat}}) \]

Minimizing this loss yields the optimal decision threshold, which keeps the predicted pCSAT distribution as close as possible to the actual CSAT distribution.
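
The three terms of the loss above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a 5-point CSAT scale (1 to 5), expresses the "% of scores ≥ 4" as a fraction between 0 and 1, and takes the normalized distributions to be per-class frequencies. The helper `apply_thresholds`, which maps a continuous model score to a class using four ascending cut points, is hypothetical; the summary does not say how the thresholds are applied.

```python
import numpy as np

CLASSES = np.arange(1, 6)  # assumed 5-point CSAT scale: 1, 2, 3, 4, 5


def class_distribution(scores):
    """Return the normalized frequency of each score 1..5."""
    scores = np.asarray(scores)
    counts = np.array([(scores == c).sum() for c in CLASSES], dtype=float)
    return counts / counts.sum()


def distribution_loss(pcsat, csat):
    """Loss = delta_pct + delta_avg + mse, following the formula summary."""
    pcsat = np.asarray(pcsat)
    csat = np.asarray(csat)
    # Difference in the share of "satisfied" scores (>= 4), as a fraction.
    delta_pct = abs((pcsat >= 4).mean() - (csat >= 4).mean())
    # Difference between the average predicted and average survey score.
    delta_avg = abs(pcsat.mean() - csat.mean())
    # Mean squared error between the two normalized class distributions.
    mse = np.mean((class_distribution(pcsat) - class_distribution(csat)) ** 2)
    return float(delta_pct + delta_avg + mse)


def apply_thresholds(raw_scores, cuts):
    """Hypothetical helper: map continuous scores to classes 1..5.

    `cuts` holds four ascending cut points; a score below cuts[0] becomes 1,
    a score at or above cuts[3] becomes 5.
    """
    return np.digitize(raw_scores, cuts) + 1


# Usage: compare two candidate cut-point sets against survey responses.
rng = np.random.default_rng(0)
raw = rng.random(1000)                      # stand-in model scores in [0, 1)
survey = rng.choice(CLASSES, size=200, p=[0.1, 0.05, 0.1, 0.25, 0.5])
for cuts in ([0.2, 0.4, 0.6, 0.8], [0.1, 0.15, 0.25, 0.5]):
    loss = distribution_loss(apply_thresholds(raw, cuts), survey)
    print(cuts, round(loss, 4))
```

In this sketch, a threshold search would amount to evaluating `distribution_loss` for many candidate cut-point sets and keeping the one with the lowest value. Note that the three terms live on different scales (a fraction, a score difference, and a squared frequency error), so any reweighting would be a further design choice not described in the summary.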