Efficient and Accurate Memorable Conversation Model using DPO based on sLLM

Youngkyung Seo, Yoonseok Heo, Jun-Seok Koh, Du-Seong Chang

2024-08-27

Abstract: In a multi-session dialog system, the memory must be updated continuously as sessions progress. Simply accumulating memory makes it difficult to focus on the relevant content of the conversation at inference time, because the input length is limited. An efficient and accurate conversation model that manages memory so that it continuously reflects the conversation history is therefore necessary. This paper presents a conversation model that efficiently manages memory as sessions progress and incorporates it into the model to reflect the conversation history accurately, using three methodologies: SFT, DPO, and DPO with the SFT model. Our model using the DPO algorithm improves memory accuracy by about 0.0591 in BERTScore, and the rate of responses reflecting the memory increases as well. Response generation performance also improves by about 4.292 in fluency, 3.935 in coherence, and 2.896 in consistency. The paper describes a training method that yields better performance than models with more than twice as many parameters. Thus, our model demonstrates efficiency not only in terms of accuracy but also in resource utilization.

Computation and Language, Artificial Intelligence

What problem does this paper attempt to address?

The paper addresses how to keep memory continuously updated in multi-session dialogue systems, particularly those built on small language models (sLLMs). It focuses on the following points:

1. **Efficient Memory Management**: Memory must be managed and updated as the dialogue progresses. Simply accumulating it makes inference difficult because the input length is limited.
2. **Accurate Reflection of Dialogue History**: The paper proposes a dialogue model that continuously reflects the dialogue history, managing and integrating memory through three methods (SFT, DPO, and DPO with the SFT model).
3. **Resource Utilization Efficiency**: Despite the model's small scale, training with the DPO algorithm improves memory accuracy as well as fluency, coherence, and consistency.

In summary, the paper addresses how to build an efficient multi-session dialogue system with small language models under resource constraints, while ensuring that the system accurately references past dialogue information.
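
For readers unfamiliar with DPO, the following is a minimal sketch of the standard per-example DPO loss, written in plain Python so it runs without a deep learning framework. It is not the authors' implementation. The function names, the `beta` value of 0.1, and the idea that the "chosen" output is the memory or response consistent with the dialogue history (and the "rejected" output the inconsistent one) are assumptions made for illustration; the text above does not describe how preference pairs were built or which hyperparameters were used.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; not taken from the paper
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the summed token log-probability of a full output
    under either the trainable policy or the frozen reference model.
    """
    # How much more the policy likes each output than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # The loss shrinks as the policy favors the chosen output more strongly.
    margin = beta * (chosen_logratio - rejected_logratio)
    return -log_sigmoid(margin)


if __name__ == "__main__":
    # Hypothetical log-probabilities, for illustration only.
    print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

When the policy and the reference model assign identical log-probabilities, the margin is zero and the loss equals log 2. Training pushes the loss below that value by raising the relative likelihood of the chosen output.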

Dual Dynamic Memory Network for End-to-End Multi-turn Task-oriented Dialog Systems

Learning to Memorize Entailment and Discourse Relations for Persona-Consistent Dialogues

Memory-Augmented Dialogue Management for Task-Oriented Dialogue Systems

Efficient Dialogue State Tracking by Selectively Overwriting Memory

Mixed-Session Conversation with Egocentric Memory

MemBench: Towards Real-world Evaluation of Memory-Augmented Dialogue Systems

sDPO: Don't Use Your Data All at Once

Ever-Evolving Memory by Blending and Refining the Past

Aging Memories Generate More Fluent Dialogue Responses with Memory Augmented Neural Networks

Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations

GraphMemDialog: Optimizing End-to-End Task-Oriented Dialog Systems Using Graph Memory Networks

Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models

LDM$^2$: A Large Decision Model Imitating Human Cognition with Dynamic Memory Enhancement

Long Time No See! Open-Domain Conversation with Long-Term Persona Memory

Deep context modeling for multi-turn response selection in dialogue systems

Retrieve & Memorize: Dialog Policy Learning with Multi-Action Memory

A Novel Linguistic-Aware Memory Structure for Enhancing the Response Generation

MemoryBank: Enhancing Large Language Models with Long-Term Memory

Evaluating Very Long-Term Conversational Memory of LLM Agents