Recent Trends in Personalized Dialogue Generation: A Review of Datasets, Methodologies, and Evaluations

Yi-Pei Chen, Noriki Nishida, Hideki Nakayama, Yuji Matsumoto
2024-05-28
Abstract: Enhancing user engagement through personalization in conversational agents has gained significance, especially with the advent of large language models that generate fluent responses. Personalized dialogue generation, however, is multifaceted and varies in its definition, ranging from instilling a persona in the agent to capturing users' explicit and implicit cues. This paper seeks to systematically survey the recent landscape of personalized dialogue generation, including the datasets employed, methodologies developed, and evaluation metrics applied. Covering 22 datasets, we highlight benchmark datasets and newer ones enriched with additional features. We further analyze 17 seminal works from top conferences between 2021 and 2023 and identify five distinct types of problems. We also shed light on recent progress by LLMs in personalized dialogue generation. Our evaluation section offers a comprehensive summary of assessment facets and metrics utilized in these works. In conclusion, we discuss prevailing challenges and envision prospective directions for future research in personalized dialogue generation.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses several challenges in personalized dialogue generation, spanning datasets, methodologies, and evaluation. Specifically:

1. **Datasets**: Current datasets have problems with size, quality, and diversity. The paper explores the characteristics of different datasets, especially the forms of persona representation (such as descriptive sentences, key-value pair attributes, user IDs, and comment histories), as well as dataset biases in terms of domains and languages.
2. **Methodologies**: The paper analyzes 17 top-conference papers published from 2021 to 2023 and identifies five different research directions:
   - **Consistency and Coherence**: Ensuring that generated responses agree with the given persona and remain coherent with the dialogue context.
   - **Persona-Context Balance**: Deciding when to pay more attention to the context and when to incorporate more personalized information.
   - **Relevant Persona Selection**: Selecting the most relevant parts of the given persona information to generate natural and engaging responses.
   - **Unknown Persona Modeling**: Extracting or implicitly modeling personalized information from the dialogue history when persona information is not explicitly given.
   - **Data Scarcity**: Addressing insufficient data through methods such as data augmentation, especially for long-tail problems and unseen persona information.
3. **Evaluation**: The paper summarizes the aspects and metrics used to evaluate personalized dialogue generation, including fluency, diversity, coherence, and the degree of personalization. Together, these metrics give a more complete picture of model performance.
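To make the diversity facet mentioned under evaluation concrete, here is a minimal sketch of Distinct-n, a diversity measure commonly used for dialogue generation. The summary above does not name specific metrics, so the choice of Distinct-n, the function name `distinct_n`, and the lowercase whitespace tokenization are all assumptions made for illustration.

```python
from collections import Counter


def distinct_n(responses, n=2):
    """Ratio of unique n-grams to total n-grams over a set of responses.

    Assumption: lowercase whitespace tokenization; real evaluations
    usually rely on a proper tokenizer.
    """
    ngrams = Counter()
    for response in responses:
        tokens = response.lower().split()
        # Zip n shifted copies of the token list to form the n-grams.
        ngrams.update(zip(*(tokens[i:] for i in range(n))))
    total = sum(ngrams.values())
    # Return 0.0 when no n-grams exist (empty input or very short responses).
    return len(ngrams) / total if total else 0.0


if __name__ == "__main__":
    samples = ["i like tea", "i like coffee"]
    print(distinct_n(samples, n=1))
    print(distinct_n(samples, n=2))
```

Higher values indicate less repetition across the generated responses. The score says nothing about whether a response fits the persona, so it would be read alongside the coherence and personalization facets.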
In conclusion, this paper aims to systematically review the latest progress in the field of personalized dialogue generation, covering datasets, methodologies, and evaluation criteria, and discusses the main challenges in existing research and future research directions.