Abstract: For many call centers, customer satisfaction (CSAT) is a key performance indicator (KPI). However, only a fraction of customers take the CSAT survey after the call, leading to a biased and inaccurate average CSAT value, and missed opportunities for coaching, follow-up, and rectification. Therefore, call centers can benefit from a model predicting customer satisfaction on calls where the customer did not complete the survey. Given that CSAT is a closely monitored KPI, it is critical to minimize any bias in the average predicted CSAT (pCSAT). In this paper, we introduce a method such that predicted CSAT (pCSAT) scores accurately replicate the distribution of survey CSAT responses for every call center with sufficient data in a live production environment. The method can be applied to many multiclass classification problems to improve the class balance and minimize its changes upon model updates.

What problem does this paper attempt to address?

This paper aims to solve several key problems in call-center customer satisfaction (CSAT) prediction:

1. **Bias Caused by Low Response Rate**:
   - After a call ends, only a small number of customers complete the CSAT survey; the average response rate is usually only 8%. This leads to bias and inaccuracy in CSAT scores, which in turn affects the evaluation of call-center performance.
   - Since the satisfaction of non-responding customers is unknown, this low response rate may lead to misjudgment of call-center performance.
2. **Accurately Predicting the Satisfaction of Customers Who Have Not Completed the Survey**:
   - The paper proposes a method that predicts the satisfaction (pCSAT) of customers who have not completed the survey, in order to make up for the data loss caused by the low response rate.
   - The key is to ensure that these predicted pCSAT scores accurately replicate the distribution of actual CSAT survey responses, thereby reducing bias and improving the reliability of prediction.
3. **Class-Distribution Changes Caused by Model Updates**:
   - When a machine-learning model is updated, the relative balance of its output classes may change, leading to unexpected results. Therefore, a control mechanism is needed to ensure that after the model is updated, the predicted pCSAT distribution remains consistent with the actual CSAT distribution.
   - The paper proposes optimizing the decision threshold so that the predicted pCSAT distribution is as close as possible to the actual CSAT distribution, thereby minimizing the impact of model updates.
4. **Class Balance in Multiclass Classification Problems**:
   - The method is not only applicable to CSAT prediction, but can also be applied to other multiclass classification problems to improve class balance and minimize changes during model updates.
By solving these problems, the paper hopes to provide a more accurate and reliable CSAT prediction method to help call centers better evaluate performance and improve services.

### Formula Summary

To ensure that pCSAT scores accurately replicate the distribution of actual CSAT survey responses, the paper introduces a loss function to optimize the decision threshold:

\[ \text{Loss} = \Delta \%_{p,c} + \Delta \text{avg}_{p,c} + \text{MSE}_{p,c} \]

where:

- \(\Delta \%_{p,c}\) is the difference between the pCSAT and CSAT satisfaction percentages:
  \[ \Delta \%_{p,c} = \left| (\% \text{ of } pCSAT \geq 4) - (\% \text{ of } CSAT \geq 4) \right| \]
- \(\Delta \text{avg}_{p,c}\) is the difference between the pCSAT and CSAT averages:
  \[ \Delta \text{avg}_{p,c} = \left| \text{avg\_pcsat} - \text{avg\_csat} \right| \]
- \(\text{MSE}_{p,c}\) is the mean squared error between the normalized pCSAT and CSAT distributions:
  \[ \text{MSE}_{p,c} = \text{MSE}(\vec{\text{pcsat}}, \vec{\text{csat}}) \]

By minimizing this loss function, the optimal decision threshold can be found, thereby ensuring that the predicted pCSAT distribution is as close as possible to the actual CSAT distribution.
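The three loss terms can be illustrated with a minimal Python sketch. It is not the paper's implementation and rests on several assumptions: CSAT is scored on a 1-5 scale, the "%" term is expressed as a fraction between 0 and 1, and the MSE is taken over the five normalized class frequencies. The function names `class_distribution`, `distribution_loss`, and `scores_to_pcsat` are hypothetical, as is the idea that a continuous model score is binned into classes by four ascending thresholds.

```python
import numpy as np

CLASSES = np.arange(1, 6)  # assumption: CSAT is scored on a 1-5 scale


def class_distribution(scores):
    """Return the normalized frequency of each class 1..5."""
    scores = np.asarray(scores)
    counts = np.array([(scores == c).sum() for c in CLASSES], dtype=float)
    return counts / counts.sum()


def distribution_loss(pcsat, csat):
    """Sum of the three terms: delta-%, delta-avg, and distribution MSE."""
    pcsat = np.asarray(pcsat)
    csat = np.asarray(csat)
    # |share of pCSAT >= 4  -  share of CSAT >= 4| (as fractions, an assumption)
    delta_pct = abs((pcsat >= 4).mean() - (csat >= 4).mean())
    # |avg_pcsat - avg_csat|
    delta_avg = abs(pcsat.mean() - csat.mean())
    # MSE between the two normalized class-frequency vectors
    diff = class_distribution(pcsat) - class_distribution(csat)
    mse = np.mean(diff ** 2)
    return float(delta_pct + delta_avg + mse)


def scores_to_pcsat(model_scores, thresholds):
    """Hypothetical binning: map continuous scores to classes 1..5
    using four ascending thresholds."""
    return np.digitize(model_scores, thresholds) + 1


# Usage: evaluate one candidate set of thresholds against survey responses.
survey_csat = [5, 5, 4, 3, 1]
model_scores = [0.05, 0.3, 0.5, 0.7, 0.95]
pcsat = scores_to_pcsat(model_scores, [0.2, 0.4, 0.6, 0.8])
print(distribution_loss(pcsat, survey_csat))
```

Under these assumptions, a threshold search would call `distribution_loss` once per candidate threshold set and keep the set with the lowest value. Because the terms are added without weights, their relative scale depends on whether the percentage term is a fraction or a percentage.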

Predicting Customer Satisfaction by Replicating the Survey Response Distribution
