Abstract: Large language models (LLMs) have demonstrated impressive performance in various applications, among which role-playing language agents (RPLAs) have engaged a broad user base. There is now a growing demand for RPLAs that represent Key Opinion Leaders (KOLs), i.e., Internet celebrities who shape the trends and opinions in their domains. However, this line of research remains underexplored. In this paper, we therefore introduce MINDECHO, a comprehensive framework for the development and evaluation of KOL RPLAs. MINDECHO collects KOL data from Internet video transcripts in various professional fields and synthesizes their conversations using GPT-4. The conversations and the transcripts are then used for individualized model training and inference-time retrieval, respectively. Our evaluation covers both general dimensions (i.e., knowledge and tones) and fan-centric dimensions for KOLs. Extensive experiments validate the effectiveness of MINDECHO in developing and evaluating KOL RPLAs.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is how to construct and evaluate role-playing language agents (RPLAs) representing key opinion leaders (KOLs). Specifically, the paper focuses on the following points:

1. **Constructing high-quality KOL RPLAs**:
   - Existing role-playing research mainly focuses on characters in fictional settings (such as Gandalf in "The Lord of the Rings"), and the data sources for these characters are mostly established novels, encyclopedias, and scripts. However, the construction of RPLAs based on real people, especially KOLs, has not been fully explored.
   - KOLs are influential figures in specific professional fields, and they establish authority and credibility by sharing professional content. Constructing KOL-based RPLAs can provide users with expert insights and answers, simulating the knowledge provided by these leaders.
2. **Coping with real-world challenges**:
   - Compared with fictional characters, KOL RPLAs need to handle more complex real-world environments, including new terms, popular events, and online communication methods.
   - KOLs have many personal opinions, which must be reflected in the RPLA and distinguished from the opinions of the LLM itself.
   - KOLs' knowledge is very intensive, so more knowledge-intensive tasks need to be designed to evaluate their capabilities.
3. **Ensuring data quality and authenticity**:
   - Video data rarely appears in the training corpora of existing LLMs, while KOL videos are usually in the first-person perspective, which is closer to real human interactions.
   - The paper proposes a method to reduce the model's dependence on opinions embedded in parameters by identifying anti-common-sense opinions and combining these opinions with the constructed data, encouraging the model to trust external knowledge more.
4. **Systematic evaluation framework**:
   - It provides a comprehensive evaluation framework to evaluate the capabilities of KOL RPLAs from three perspectives: professional knowledge, tone characteristics, and user-centered simulated interactions.
   - The basic performance is evaluated by multiple-choice questions to quantitatively reflect the model's capabilities; the user-centered performance is evaluated by simulating the interactions between new and old fans and the RPLA.

In summary, this paper aims to fill the research gap in constructing and evaluating role-playing language agents based on real people, especially KOLs, and proposes a comprehensive framework named MINDECHO to achieve this goal.
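
The multiple-choice part of the evaluation described above can be illustrated with a minimal sketch. It is not the paper's code: the `MultipleChoiceItem` type, the `accuracy` function, the sample questions, and the stub agent `always_a` are all hypothetical names and data introduced here, and the sketch assumes each item has one gold option letter and that the agent's reply starts with its chosen letter.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class MultipleChoiceItem:
    """One hypothetical evaluation item with a single gold option letter."""
    question: str
    options: Sequence[str]  # e.g. ["A. ...", "B. ..."]
    answer: str             # gold letter, e.g. "B"


def accuracy(items: Sequence[MultipleChoiceItem],
             answer_fn: Callable[[MultipleChoiceItem], str]) -> float:
    """Fraction of items where the agent's first reply character matches the gold letter."""
    if not items:
        raise ValueError("need at least one item to compute accuracy")
    correct = 0
    for item in items:
        reply = answer_fn(item).strip().upper()
        # Assumption: the reply begins with the chosen option letter.
        if reply[:1] == item.answer.strip().upper():
            correct += 1
    return correct / len(items)


# Made-up placeholder items; real items would be built from KOL transcripts.
items = [
    MultipleChoiceItem("Placeholder question 1?", ["A. x", "B. y"], "A"),
    MultipleChoiceItem("Placeholder question 2?", ["A. x", "B. y"], "B"),
    MultipleChoiceItem("Placeholder question 3?", ["A. x", "B. y"], "A"),
    MultipleChoiceItem("Placeholder question 4?", ["A. x", "B. y"], "B"),
]


def always_a(item: MultipleChoiceItem) -> str:
    """Stub standing in for an RPLA; a real agent would call a model here."""
    return "A"


score = accuracy(items, always_a)
```

In practice `answer_fn` would wrap a call to the role-playing model, and separate item sets could be scored for knowledge and for tone.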

MINDECHO: Role-Playing Language Agents for Key Opinion Leaders

Capturing Minds, Not Just Words: Enhancing Role-Playing Language Models with Personality-Indicative Data

Role-RL: Online Long-Context Processing with Role Reinforcement Learning for Distinct LLMs in Their Optimal Roles

PrefCLM: Enhancing Preference-based Reinforcement Learning with Crowdsourced Large Language Models

RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models

Fine-Grained Behavior Simulation with Role-Playing Large Language Model on Social Media

Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent

RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs

Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment

The Perfect Blend: Redefining RLHF with Mixture of Judges

Self-Boosting Large Language Models with Synthetic Preference Data

BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model

NaRLE: Natural Language Models using Reinforcement Learning with Emotion Feedback

LIRE: listwise reward enhancement for preference alignment

Orchestrating LLMs with Different Personalizations

RoleEval: A Bilingual Role Evaluation Benchmark for Large Language Models

On the Decision-Making Abilities in Role-Playing using Large Language Models

RoleCraft-GLM: Advancing Personalized Role-Playing in Large Language Models

CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation

Thinking Before Speaking: A Role-playing Model with Mindset

Orca: Enhancing Role-Playing Abilities of Large Language Models by Integrating Personality Traits