Doing Personal LAPS: LLM-Augmented Dialogue Construction for Personalized Multi-Session Conversational Search

Hideaki Joko, Shubham Chatterjee, Andrew Ramsay, Arjen P. de Vries, Jeff Dalton, Faegheh Hasibi

DOI: https://doi.org/10.1145/3626772.3657815

2024-05-06

Abstract: The future of conversational agents will provide users with personalized information responses. However, a significant challenge in developing models is the lack of large-scale dialogue datasets that span multiple sessions and reflect real-world user preferences. Previous approaches rely on experts in a wizard-of-oz setup that is difficult to scale, particularly for personalized tasks. Our method, LAPS, addresses this by using large language models (LLMs) to guide a single human worker in generating personalized dialogues. This method has proven to speed up the creation process and improve quality. LAPS can collect large-scale, human-written, multi-session, and multi-domain conversations, including extracting user preferences. When compared to existing datasets, LAPS-produced conversations are as natural and diverse as expert-created ones, which stands in contrast with fully synthetic methods. The collected dataset is suited to train preference extraction and personalized response generation. Our results show that responses generated explicitly using extracted preferences better match users' actual preferences, highlighting the value of using extracted preferences over simple dialogue history. Overall, LAPS introduces a new method to leverage LLMs to create realistic personalized conversational data more efficiently and effectively than previous methods.

Information Retrieval

What problem does this paper attempt to address?

The main problem this paper addresses is the difficulty of developing personalized multi-session dialogue systems, in particular the lack of large-scale multi-session dialogue datasets that reflect real user preferences. Specifically:

1. **Lack of large-scale multi-session dialogue data**: Existing dialogue datasets are usually small and mostly consist of single sessions, so they cannot reflect how user preferences change across multiple sessions.
2. **Difficulty in generating high-quality dialogue data**: Traditional expert-driven collection is hard to scale, while fully synthetic methods produce dialogues that lack diversity and naturalness and fail to reflect real user preferences.

To address these issues, the paper proposes LAPS (LLM-Augmented Personalized Self-Dialogue), a method that uses large language models (LLMs) to assist human workers in writing personalized multi-session dialogues. The LAPS method can:

- **Improve data generation efficiency**: LLM-generated guidance helps human workers write high-quality dialogues more quickly.
- **Ensure dialogue diversity and naturalness**: Compared to fully synthetic methods, dialogues produced with LAPS are more natural and diverse.
- **Extract and store user preferences**: After each dialogue session, user preferences are extracted from the dialogue and stored in a preference memory for use in subsequent sessions.

Through these mechanisms, LAPS can collect large-scale, multi-domain, multi-session dialogue data that includes real user preferences, providing high-quality training data for future personalized dialogue systems.
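The extract-and-store loop described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `PreferenceMemory` class, the `extract_preferences` and `build_guidance` functions, the `"recipe"` domain label, and the keyword-based extraction rule are all hypothetical stand-ins. In LAPS the extraction and guidance steps involve an LLM, so a toy string match is used here only to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class PreferenceMemory:
    """Hypothetical per-user store of preferences, grouped by domain."""
    prefs: dict[str, list[str]] = field(default_factory=dict)

    def update(self, domain: str, new_prefs: list[str]) -> None:
        # Append preferences from the latest session, skipping exact repeats.
        stored = self.prefs.setdefault(domain, [])
        for pref in new_prefs:
            if pref not in stored:
                stored.append(pref)

    def recall(self, domain: str) -> list[str]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self.prefs.get(domain, []))


def extract_preferences(session: list[str]) -> list[str]:
    """Toy stand-in for the extraction step (assumption: simple prefix match)."""
    markers = ("i like ", "i love ", "i avoid ", "i am ", "i'm ")
    return [utt for utt in session if utt.lower().startswith(markers)]


def build_guidance(memory: PreferenceMemory, domain: str) -> str:
    """Compose guidance text for the next session from stored preferences."""
    prefs = memory.recall(domain)
    if not prefs:
        return "No stored preferences yet; ask the user about their preferences."
    return "Known preferences: " + "; ".join(prefs)


# Session 1: extract preferences and store them for later sessions.
memory = PreferenceMemory()
session_1 = ["I'm vegetarian.", "What can I cook tonight?", "I like spicy food."]
memory.update("recipe", extract_preferences(session_1))

# Session 2: the stored preferences are available before the dialogue starts.
guidance = build_guidance(memory, "recipe")
print(guidance)
```

Keeping the memory keyed by domain is one possible design choice for multi-domain data; a flat list or a richer schema would work equally well for this illustration.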
