Abstract:The unparalleled performance of closed-sourced ChatGPT has sparked efforts towards its democratization, with notable strides made by leveraging real user and ChatGPT dialogues, as evidenced by Vicuna. However, due to challenges in gathering dialogues involving human participation, current endeavors like Baize and UltraChat rely on ChatGPT conducting roleplay to simulate humans based on instructions, resulting in overdependence on seeds, diminished human-likeness, limited topic diversity, and an absence of genuine multi-round conversational dynamics. To address the above issues, we propose a paradigm to simulate human behavior better and explore the benefits of incorporating more human-like questions in multi-turn conversations. Specifically, we directly target human questions extracted from genuine human-machine conversations as a learning goal and provide a novel user simulator called `Socratic'. The experimental results show our response model, `PlatoLM', achieves SoTA performance among LLaMA-based 7B models in MT-Bench. Our findings further demonstrate that our method introduces highly human-like questioning patterns and rich topic structures, which can teach the response model better than previous works in multi-round conversations.

What problem does this paper attempt to address?

The main problem that this paper attempts to solve is how to more realistically simulate human behavior in multi - turn conversations to improve the performance of large - language models (LLMs) in dialogue tasks. Specifically, the author points out several limitations of current methods: 1. **Dependence on seed conversations**: Many existing methods rely on ChatGPT for static role - playing to simulate human conversations, which results in the conversation content being overly dependent on seed conversations and lacking diversity. 2. **Lack of real multi - turn conversation dynamics**: Since static simulation is difficult to capture real - life human conversation patterns, the generated conversations often lack the natural multi - turn interaction characteristics. 3. **Single - topic structure**: Existing simulation methods are difficult to produce rich topic structures, limiting the diversity and depth of conversations. To solve these problems, the author proposes a new paradigm by training a learnable user simulator (called "Socratic") that directly targets real - human questions for learning. This user simulator can more naturally engage in multi - turn conversations with system agents (such as ChatGPT), thereby generating a conversation dataset (called "SocraticChat") that is closer to real - life scenarios. Finally, a new response model (called "PlatoLM") is trained based on this dataset to improve its performance in multi - turn conversations. ### Main contributions 1. **Proposed an effective human - behavior - simulation paradigm**: By reversing the learning objective, from ChatGPT's answers to real - user questions, the simulator becomes more human - like. 2. **Provided multiple versions of multi - turn conversation datasets**: These datasets expand the scale and diversity of existing conversation datasets. 3. **Trained a new response model, PlatoLM**: With a small number of training samples, PlatoLM performs well in multiple benchmark tests, especially outperforming other models in multi - turn conversation tasks. 4. **Discovered that more human - like questioning patterns are helpful for teaching**: Compared to static role - playing, human - like questioning patterns in dynamic multi - turn conversations can better guide the learning of dialogue models. Through these improvements, the paper demonstrates how to improve the performance of large - language models in multi - turn conversations through more realistic conversation simulation.

PlatoLM: Teaching LLMs in Multi-Round Dialogue via a User Simulator

Learning through Dialogue Interactions by Asking Questions

Dialogue Learning with Human-in-the-Loop.

Uman-in-thel oop

Let the LLMs Talk: Simulating Human-to-Human Conversational QA via Zero-Shot LLM-to-LLM Interactions

LLM Roleplay: Simulating Human-Chatbot Interaction

Real or Robotic? Assessing Whether LLMs Accurately Simulate Qualities of Human Responses in Dialogue

User Simulation with Large Language Models for Evaluating Task-Oriented Dialogue

DuetSim: Building User Simulator with Dual Large Language Models for Task-Oriented Dialogues

Simulating User Agents for Embodied Conversational-AI

SPL: A Socratic Playground for Learning Powered by Large Language Model

Reliable LLM-based User Simulator for Task-Oriented Dialogue Systems

DiverseDialogue: A Methodology for Designing Chatbots with Human-Like Diversity

Multi-User Chat Assistant (MUCA): a Framework Using LLMs to Facilitate Group Conversations

PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable

Beyond ChatBots: ExploreLLM for Structured Thoughts and Personalized Model Responses

Response Generation for Cognitive Behavioral Therapy with Large Language Models: Comparative Study with Socratic Questioning

Large Language Model based Situational Dialogues for Second Language Learning

Character-LLM: A Trainable Agent for Role-Playing

SimulBench: Evaluating Language Models with Creative Simulation Tasks

Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games