BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model

Yeyong Yu, Runsheng Yu, Haojie Wei, Zhanqiu Zhang, Quan Qian
2024-08-29
Abstract: The rapid advancement of large language models (LLMs) has revolutionized role-playing, enabling the development of general role-playing models. However, current role-playing training has two significant issues: (I) Using a predefined role profile to prompt dialogue training for specific scenarios usually leads to inconsistencies and even conflicts between the dialogue and the profile, resulting in training biases. (II) The model learns to imitate the role based solely on the profile, neglecting profile-dialogue alignment at the sentence level. In this work, we propose a simple yet effective framework called BEYOND DIALOGUE, designed to overcome these hurdles. This framework innovatively introduces "beyond dialogue" tasks to align dialogue with profile traits based on each specific scenario, thereby eliminating biases during training. Furthermore, by adopting an innovative prompting mechanism that generates reasoning outcomes for training, the framework allows the model to achieve fine-grained alignment between profile and dialogue at the sentence level. The aforementioned methods are fully automated and low-cost. Additionally, the integration of automated dialogue and objective evaluation methods forms a comprehensive framework, paving the way for general role-playing. Experimental results demonstrate that our model excels in adhering to and reflecting various dimensions of role profiles, outperforming most proprietary general and specialized role-playing baselines. All code and datasets are available at https://github.com/yuyouyu32/BeyondDialogue.
Computation and Language, Human-Computer Interaction
What problem does this paper attempt to address?
This paper addresses two problems in role-playing training: the bias introduced when predefined role profiles are inconsistent with the dialogues they prompt, and the lack of fine-grained alignment between role profiles and dialogues during training. Specifically:

1. **Bias between Profiles and Dialogues**: Current role-playing training typically uses a predefined role profile to prompt dialogue training in specific scenarios. The dialogue is often inconsistent with the profile, or even conflicts with it, which introduces training bias. For example, Hermione from "Harry Potter" has a speaking style defined as "bookish", "encouraging", and "meddlesome", but the dialogue in a given scenario may reflect only some of these traits, which prevents the model from learning to follow the role profile fully.
2. **Lack of Fine-grained Alignment**: Current models learn to imitate a role mainly from its profile and ignore alignment between profile and dialogue at the sentence level. As a result, the model learns only a fuzzy mapping and does not learn how specific traits are manifested in the dialogue.

To solve these problems, the authors propose a new framework, **BEYOND DIALOGUE**. It introduces "beyond dialogue" tasks that align each dialogue with the profile traits for its specific scenario, and it uses a prompting mechanism that generates reasoning outcomes for training, achieving fine-grained alignment between profile and dialogue at the sentence level. According to the authors, the framework eliminates the bias introduced during training and improves the model's role-playing performance.
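
The profile-dialogue mismatch in the Hermione example can be made concrete with a small Python sketch. It is a toy illustration only, not the authors' implementation: the `RoleProfile` and `TraitEvidence` classes, the `align_profile` function, the two dialogue sentences, and the hand-written evidence annotations are all hypothetical. The abstract describes the alignment as fully automated via prompting, so in practice such annotations would presumably come from a language model rather than being written by hand.

```python
from dataclasses import dataclass, field


@dataclass
class RoleProfile:
    name: str
    style_traits: list[str]


@dataclass
class TraitEvidence:
    # Hypothetical annotation: indices of dialogue sentences that exhibit the trait.
    trait: str
    sentence_indices: list[int] = field(default_factory=list)


def align_profile(profile: RoleProfile, evidence: list[TraitEvidence]) -> RoleProfile:
    """Return a scenario-specific profile that keeps only the traits
    supported by at least one sentence of the scenario's dialogue."""
    supported = {e.trait for e in evidence if e.sentence_indices}
    kept = [t for t in profile.style_traits if t in supported]
    return RoleProfile(profile.name, kept)


# Predefined profile, using the traits named in the summary above.
hermione = RoleProfile("Hermione", ["bookish", "encouraging", "meddlesome"])

# Invented placeholder dialogue for one scenario (not taken from any dataset).
dialogue = [
    "I read about this spell in the library last week.",
    "You can do it, just concentrate.",
]

# Hand-written sentence-level evidence (assumption: an automated step
# would produce this). "meddlesome" has no supporting sentence here.
evidence = [
    TraitEvidence("bookish", [0]),
    TraitEvidence("encouraging", [1]),
    TraitEvidence("meddlesome", []),
]

aligned = align_profile(hermione, evidence)
print(aligned.style_traits)  # ['bookish', 'encouraging']
```

Training on the aligned profile rather than the full one avoids asking the model to associate "meddlesome" with a dialogue that never shows it, and the per-sentence indices are the kind of signal that sentence-level alignment relies on.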