Enhancing the Open-Domain Dialogue Evaluation in Latent Space

Zhangming Chan, Lemao Liu, Juntao Li, Haisong Zhang, Dongyan Zhao, Shuming Shi, Rui Yan
DOI: https://doi.org/10.18653/v1/2021.findings-acl.432
2021-01-01
Abstract: The notorious one-to-many nature of open-domain dialogues poses huge challenges for automatic evaluation methods. Recent studies attempt to mitigate this issue by considering the similarity of the generated response with the conversational context and by designing discriminative models that learn from multiple positive responses. Despite the promising results, these methods cannot be applied to general scenarios where training data with multiple responses is unavailable. To this end, we propose a self-supervised setting to obtain a smooth latent space that can both capture discourse-level context information and implicitly model more references. Specifically, we present EMS, an Enhanced dialogue evaluation Metric in latent Space. Experimental results on two real-world dialogue datasets confirm the superiority of our method for open-domain dialogue evaluation: its Pearson and Spearman correlations with human judgments are higher than those of all baselines.