Abstract: Neural end-to-end goal-oriented dialog systems have shown promise in reducing the workload of human agents for customer service, as well as reducing wait time for users. However, their inability to handle new user behavior at deployment has limited their usage in the real world. In this work, we propose an end-to-end trainable method for neural goal-oriented dialog systems which handles new user behaviors at deployment by transferring the dialog to a human agent intelligently. The proposed method has three goals: 1) maximize the user's task success by transferring to human agents, 2) minimize the load on the human agents by transferring to them only when it is essential, and 3) learn online from the human agent's responses to reduce the human agents' load further. We evaluate our proposed method on a modified-bAbI dialog task that simulates the scenario of new user behaviors occurring at test time. Experimental results show that our proposed method is effective in achieving the desired goals.

What problem does this paper attempt to address?

This paper addresses the inability of neural end-to-end goal-oriented dialogue systems to handle new user behaviors after deployment. When such a system encounters user behavior that did not occur during training, it may fail and leave the user's task incomplete. Such a failure not only harms the current user's experience but may also damage the enterprise's reputation and deter future users.

The paper therefore proposes a method that aims to improve the system's robustness and practicality through three goals:

1. **Maximize the success rate of user tasks**: Intelligently transfer the dialogue to a human agent in cases where the system is likely to fail, so that the user's task can still be completed.
2. **Minimize the use of human agents**: Transfer the dialogue to a human agent only when necessary, reducing dependence on human agents.
3. **Learn online from the responses of human agents**: Learn from the human agent's responses so that dependence on human agents decreases over time.

To achieve these goals, the paper proposes an end-to-end trainable method that combines a neural dialogue model \(M\) and a neural classifier \(C\). Given the current dialogue state, the classifier \(C\) decides whether the dialogue should be transferred to a human agent. After a transfer, the system learns from the human agent's responses so that it can handle similar situations on its own in the future. The paper also designs a reward function for training \(C\) with reinforcement learning (RL), so that \(C\) balances maximizing the success rate of user tasks against minimizing the use of human agents.

The paper evaluates the method on the modified bAbI dialogue task. The experimental results show that, compared with the baseline method, the proposed method significantly improves the success rate of user tasks while reducing dependence on human agents.
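The trade-off that the reward function is meant to encode can be illustrated with a small sketch. This is not the paper's implementation: the reward values, the two-state "familiar"/"novel" context, and the names `REWARDS`, `TransferPolicy`, and `train` are all assumptions made for illustration, and a tabular policy stands in for the neural classifier \(C\). It also assumes that, during training, it is known whether the model \(M\) would have answered correctly. The update is a plain REINFORCE step on a Bernoulli "transfer or not" decision.

```python
import math
import random

# Hypothetical reward table (illustrative values only, not taken from the paper).
# Key: (who responds, whether the model's own response would have been correct).
REWARDS = {
    ("model", True): 1.0,    # model answered correctly, no human needed
    ("model", False): -4.0,  # model answered wrongly, the user's task may fail
    ("human", True): -0.5,   # unnecessary transfer, wastes human effort
    ("human", False): 1.0,   # necessary transfer, the task is rescued
}


def reward(action: str, model_correct: bool) -> float:
    """Look up the illustrative reward for one transfer decision."""
    return REWARDS[(action, model_correct)]


class TransferPolicy:
    """Tabular stand-in for the classifier C: one logit per dialog context."""

    def __init__(self, contexts, lr=0.1):
        self.logits = {c: 0.0 for c in contexts}
        self.lr = lr

    def p_transfer(self, context: str) -> float:
        """Probability of handing the dialog to a human agent."""
        return 1.0 / (1.0 + math.exp(-self.logits[context]))

    def act(self, context: str, rng: random.Random) -> str:
        """Sample 'human' (transfer) or 'model' (answer automatically)."""
        return "human" if rng.random() < self.p_transfer(context) else "model"

    def reinforce_update(self, context: str, action: str, r: float) -> None:
        """One REINFORCE step: logit += lr * r * d log pi(action) / d logit."""
        p = self.p_transfer(context)
        grad_log_pi = (1.0 - p) if action == "human" else -p
        self.logits[context] += self.lr * r * grad_log_pi


def train(episodes: int = 500, seed: int = 0) -> TransferPolicy:
    """Train on a toy setup: the model is right on 'familiar' user behavior
    and wrong on 'novel' behavior (an assumption of this sketch)."""
    rng = random.Random(seed)
    model_correct = {"familiar": True, "novel": False}
    policy = TransferPolicy(model_correct.keys())
    for _ in range(episodes):
        for context, correct in model_correct.items():
            action = policy.act(context, rng)
            policy.reinforce_update(context, action, reward(action, correct))
    return policy


if __name__ == "__main__":
    trained = train()
    for ctx in ("familiar", "novel"):
        print(ctx, round(trained.p_transfer(ctx), 3))
```

Under these assumed rewards, the policy drifts toward transferring on novel behavior and toward answering automatically on familiar behavior. The large penalty for a wrong automatic answer relative to the small penalty for an unnecessary transfer is one way to express that task success is prioritized over saving human effort; the actual values used in the paper may differ.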

Learning End-to-End Goal-Oriented Dialog with Maximal User Task Success and Minimal Human Agent Use