Unlocking the Potential of User Feedback: Leveraging Large Language Model as User Simulator to Enhance Dialogue System

Zhiyuan Hu, Yue Feng, Anh Tuan Luu, Bryan Hooi, Aldo Lipani

DOI: https://doi.org/10.1145/3583780.3615220

2023-10-20

Abstract: Dialogue systems and large language models (LLMs) have gained considerable attention. However, using LLMs directly as task-oriented dialogue (TOD) models has been found to underperform smaller task-specific models. Nonetheless, LLMs have significant potential, and it is worth exploring better ways to leverage their abilities. With this motivation, we propose an alternative approach called User-Guided Response Optimization (UGRO), which combines an LLM with a smaller TOD model. The approach uses the LLM as an annotation-free user simulator that assesses dialogue responses produced by a smaller fine-tuned end-to-end TOD model. Using the satisfaction feedback generated by the LLM, UGRO further optimizes the supervised fine-tuned TOD model. Specifically, the TOD model takes the dialogue history as input and, with the assistance of the user simulator's feedback, generates high-satisfaction responses that meet the user's requirements. Empirical experiments on two TOD benchmarks validate the effectiveness of our method: the approach outperforms previous state-of-the-art (SOTA) results.

Computation and Language

What problem does this paper attempt to address?

The paper addresses the underperformance of large language models (LLMs) when they are used directly as task-oriented dialogue (TOD) models, compared with smaller task-specific models. The authors propose a method called "User-Guided Response Optimization" (UGRO), which uses an LLM as a user simulator to evaluate the responses of a smaller fine-tuned TOD model and feeds the resulting satisfaction scores back to optimize that model. The approach is intended to sidestep the challenges of applying LLMs directly to TOD tasks, such as insufficient domain knowledge, lack of background information, and limited context understanding. Experiments on two TOD benchmarks show that UGRO outperforms previous state-of-the-art results.
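The feedback loop described above can be illustrated with a minimal sketch. The text here does not say which optimization algorithm UGRO uses, so the sketch substitutes simple candidate reranking for the actual training step. Everything in it is an assumption made for illustration: `toy_satisfaction` is a keyword-based stand-in for the LLM user simulator, and the 1 to 5 score range, the function names, and the example dialogue are all hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for the LLM user simulator. A real system would
# prompt an LLM with the dialogue history and the candidate response and
# parse a satisfaction score from its output.
def toy_satisfaction(history: List[str], response: str) -> int:
    """Return a score from 1 to 5 based on how many requested slots are covered."""
    last_user_turn = history[-1].lower()
    wanted = [w for w in ("price", "address", "phone") if w in last_user_turn]
    covered = sum(1 for w in wanted if w in response.lower())
    return 1 + min(4, 2 * covered)

def rank_by_feedback(
    history: List[str],
    candidates: List[str],
    simulator: Callable[[List[str], str], int],
) -> List[Tuple[int, str]]:
    """Score each candidate with the simulator and sort from best to worst."""
    scored = [(simulator(history, c), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Candidate responses, as a smaller TOD model might produce them (made up here).
history = ["I need the address and phone number of the hotel."]
candidates = [
    "Sure, anything else?",
    "The address is [value_address].",
    "The address is [value_address] and the phone is [value_phone].",
]

ranked = rank_by_feedback(history, candidates, toy_satisfaction)
best_score, best_response = ranked[0]
print(best_score, best_response)
```

In a training setting, the same scores could serve as a reward signal for updating the TOD model rather than only for picking a response; that step is left out because the details are not given here.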

User Simulation with Large Language Models for Evaluating Task-Oriented Dialogue

Large Language Models as User-Agents for Evaluating Task-Oriented-Dialogue Systems

Learning through Dialogue Interactions by Asking Questions

Reliable LLM-based User Simulator for Task-Oriented Dialogue Systems

InstructTODS: Large Language Models for End-to-End Task-Oriented Dialogue Systems

Language Urban Odyssey: A Serious Game for Enhancing Second Language Acquisition Through Large Language Models

DuetSim: Building User Simulator with Dual Large Language Models for Task-Oriented Dialogues

Leveraging LLMs for Dialogue Quality Measurement

Improving Multi-Domain Task-Oriented Dialogue System with Offline Reinforcement Learning

CAUSE: Counterfactual Assessment of User Satisfaction Estimation in Task-Oriented Dialogue Systems

Sparse Rewards Can Self-Train Dialogue Agents

Is MultiWOZ a Solved Task? An Interactive TOD Evaluation Framework with User Simulator

Do Large Language Models with Reasoning and Acting Meet the Needs of Task-Oriented Dialogue?

Large Language Model based Situational Dialogues for Second Language Learning

Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models

DialogBench: Evaluating LLMs as Human-like Dialogue Systems

DIALIGHT: Lightweight Multilingual Development and Evaluation of Task-Oriented Dialogue Systems with Large Language Models

Should We Fine-Tune or RAG? Evaluating Different Techniques to Adapt LLMs for Dialogue

Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk

Large Language Models as Zero-shot Dialogue State Tracker through Function Calling