Supervised ensemble sentiment-based framework to measure chatbot quality of services

Ebtesam Hussain Almansor, Farookh Khadeer Hussain, Omar Khadeer Hussain
DOI: https://doi.org/10.1007/s00607-020-00863-0
2020-11-04
Computing
Abstract: Developing intelligent chatbots has become a trending topic in computer science over the last few years. However, a chatbot often fails to understand the user's intent, which can lead to inappropriate responses that cause dialogue breakdown and user dissatisfaction. Detecting dialogue breakdown is essential to improving the performance of the chatbot and increasing user satisfaction. Recent work has modeled conversation breakdown using several approaches, both supervised and unsupervised. Unsupervised approaches rely heavily on large datasets, which makes them challenging to apply to the breakdown task. Another challenge in predicting conversation breakdown is the bias of human annotation of the dataset and the process for handling the breakdown. To tackle these challenges, we have developed a supervised ensemble automated approach that measures Chatbot Quality of Service (CQoS) based on dialogue breakdown. The proposed approach labels the datasets based on sentiment, considering the context of the conversation, to predict the breakdown. In this paper we aim to detect the effect of the sentiment change of each speaker in a conversation. Furthermore, we use the supervised ensemble model to measure the CQoS based on breakdown. We then handle the breakdown using a hand-over mechanism that transfers the user to a live agent. Based on this idea, we perform several experiments across several datasets and state-of-the-art models, and we find that using sentiment as a trigger for breakdown outperforms human annotation. Overall, we infer that knowledge acquired from the supervised ensemble model can indeed help to measure CQoS based on detecting the breakdown in conversation.
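To make the idea of sentiment as a breakdown trigger more concrete, the following is a minimal illustrative sketch in Python. It is not the authors' supervised ensemble model: the word lists, the scoring function, the `drop_threshold` parameter, and the `route` helper are all hypothetical and stand in for whatever sentiment model and hand-over logic the paper actually uses. It only shows how a drop in a user's sentiment between consecutive turns could flag a breakdown and trigger a transfer to a live agent.

```python
# Toy sentiment lexicon: an assumption for illustration only,
# not the sentiment model used in the paper.
POSITIVE = {"thanks", "great", "good", "perfect", "helpful"}
NEGATIVE = {"wrong", "useless", "bad", "confused", "no"}


def sentiment_score(utterance: str) -> int:
    """Count positive words minus negative words in one utterance."""
    words = [w.strip(".,!?").lower() for w in utterance.split()]
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives


def detect_breakdown(user_turns, drop_threshold=2):
    """Return the index of the first user turn whose sentiment falls by at
    least drop_threshold relative to the previous user turn, or None.

    drop_threshold is a hypothetical tuning parameter.
    """
    scores = [sentiment_score(turn) for turn in user_turns]
    for i in range(1, len(scores)):
        if scores[i - 1] - scores[i] >= drop_threshold:
            return i
    return None


def route(user_turns):
    """Hand the user over to a live agent if a breakdown is detected."""
    if detect_breakdown(user_turns) is not None:
        return "live_agent"
    return "chatbot"


if __name__ == "__main__":
    turns = ["Thanks, that is great", "No, that is wrong and useless"]
    print(detect_breakdown(turns), route(turns))
```

In a realistic setting the toy lexicon would be replaced by a trained sentiment classifier, and the fixed threshold by the output of a supervised model; the sketch only illustrates the trigger-and-hand-over control flow.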
computer science, theory & methods