Abstract: Replying with a related, fluent, and informative response is an indispensable requirement for building high-quality conversational agents. Several approaches have been proposed to generate better responses: feeding in extra information by collecting large-scale datasets with human annotations, designing neural conversational models (NCMs) with complex architectures and loss functions, or filtering out untrustworthy samples based on a single dialogue attribute, e.g., Relatedness or Genericness. In this paper, we follow the third research branch and present a data filtering method for open-domain dialogues, which identifies untrustworthy samples in the training data with a quality measure that linearly combines seven dialogue attributes. The attribute weights are obtained via Bayesian Optimization (BayesOpt), which iteratively optimizes an objective function for dialogue generation on the validation set. We then score the training samples with the quality measure, sort them in descending order, and filter out those at the bottom. Furthermore, to accelerate the "filter-train-evaluate" iterations that BayesOpt requires on large-scale datasets, we propose a training framework that integrates maximum likelihood estimation (MLE) and negative training (NEG). Rather than retraining from scratch, this framework updates the parameters of an already trained NCM on two small sets containing the newly maintained and the newly removed samples, respectively. Specifically, MLE is applied to maximize the log-likelihood of the newly maintained samples, while NEG is used to minimize the log-likelihood of the newly removed ones. Experimental results on two datasets show that our method can effectively identify untrustworthy samples, and that NCMs trained on the filtered datasets achieve better performance.
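
The scoring and filtering steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`quality_score`, `filter_samples`, `mle_neg_loss`), the `drop_ratio` parameter, and the equal weighting of the MLE and NEG terms are all assumptions made for illustration. The abstract does not say how many samples are removed or how the two training terms are balanced, and the BayesOpt search over the weights is omitted here.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[str, Sequence[float]]  # (sample id, attribute values); hypothetical layout


def quality_score(attributes: Sequence[float], weights: Sequence[float]) -> float:
    """Linear combination of dialogue attributes (seven in the paper)."""
    if len(attributes) != len(weights):
        raise ValueError("attributes and weights must have the same length")
    return sum(w * a for w, a in zip(weights, attributes))


def filter_samples(
    samples: List[Sample], weights: Sequence[float], drop_ratio: float
) -> Tuple[List[Sample], List[Sample]]:
    """Sort by score in descending order and cut off the bottom fraction.

    drop_ratio is an assumed hyperparameter, not a value from the paper.
    Returns (kept, removed).
    """
    if not 0.0 <= drop_ratio <= 1.0:
        raise ValueError("drop_ratio must be in [0, 1]")
    ranked = sorted(samples, key=lambda s: quality_score(s[1], weights), reverse=True)
    cut = len(ranked) - int(len(ranked) * drop_ratio)
    return ranked[:cut], ranked[cut:]


def mle_neg_loss(
    logp_maintained: Sequence[float], logp_removed: Sequence[float]
) -> float:
    """Loss to minimize: raise log-likelihood of newly maintained samples (MLE)
    and lower it for newly removed ones (NEG). Equal weighting is an assumption.
    """
    return -sum(logp_maintained) + sum(logp_removed)


if __name__ == "__main__":
    # Toy data with two attributes per sample instead of seven.
    data = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.5, 0.5]), ("d", [0.0, 0.0])]
    kept, removed = filter_samples(data, weights=[1.0, 2.0], drop_ratio=0.25)
    print([s[0] for s in kept], [s[0] for s in removed])
```

In a full "filter-train-evaluate" loop, an optimizer would propose a weight vector, the filter would be re-run, and only the samples whose kept/removed status changed would feed the MLE and NEG terms.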

HERALD: An Annotation Efficient Method to Detect User Disengagement in Social Conversations

Hagan: Hierarchical Attentive Adversarial Learning For Task-Oriented Dialogue System

Human-centred Design on Crowdsourcing Annotation Towards Improving Active Learning Model Performance

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

LIDA: Lightweight Interactive Dialogue Annotator

CAUSE: Counterfactual Assessment of User Satisfaction Estimation in Task-Oriented Dialogue Systems

An Efficient Self-Learning Framework For Interactive Spoken Dialog Systems

DAT: Dialogue-Aware Transformer with Modality-Group Fusion for Human Engagement Estimation

Interaction Matters: An Evaluation Framework for Interactive Dialogue Assessment on English Second Language Conversations

HIPPL: Hierarchical Intent-Inferring Pointer Network With Pseudo Labeling for Consistent Persona-Driven Dialogue Generation

Auto-Dialabel: Labeling Dialogue Data with Unsupervised Learning

End-to-End Trainable Non-Collaborative Dialog System

Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs

Exploring the Roles of NLP-based Dialog Indicators in Predicting User Experience in interacting with Large Language Model System

Zero-Shot Dialogue Disentanglement by Self-Supervised Entangled Response Selection

Enhancing the Open-Domain Dialogue Evaluation in Latent Space

Dynamic Causal Disentanglement Model for Dialogue Emotion Detection

DialogUSR: Complex Dialogue Utterance Splitting and Reformulation for Multiple Intent Detection

Understanding User Satisfaction with Task-oriented Dialogue Systems

Identifying Untrustworthy Samples: Data Filtering for Open-domain Dialogues with Bayesian Optimization

Improving (Dis)agreement Detection with Inductive Social Relation Information From Comment-Reply Interactions