User Response and Sentiment Prediction for Automatic Dialogue Evaluation

Sarik Ghazarian, Behnam Hedayatnia, Alexandros Papangelis, Yang Liu, Dilek Hakkani-Tur
DOI: https://doi.org/10.48550/arXiv.2111.08808
2022-02-17
Abstract: Automatic evaluation is beneficial for open-domain dialog system development. However, standard word-overlap metrics (BLEU, ROUGE) do not correlate well with human judgements of open-domain dialog systems. In this work, we propose to use the sentiment of the next user utterance for turn- or dialog-level evaluation. Specifically, we propose three methods: one that predicts the next sentiment directly, and two others that predict the next user utterance using an utterance or a feedback generator model and then classify its sentiment. Experiments show that our model outperforms existing automatic evaluation metrics on both written and spoken open-domain dialogue datasets.
Computation and Language