Abstract: Large Language Models (LLMs) have exhibited remarkable proficiency in comprehending and generating natural language. Personalized LLM response generation, in turn, holds the potential to offer substantial benefits for individuals in critical areas such as medicine. Existing research has explored memory-augmented methods that prompt the LLM with pre-stored user-specific knowledge to generate personalized responses to new queries. We contend that such a paradigm is unable to perceive fine-grained information. In this study, we propose a novel Memory-injected approach that uses parameter-efficient fine-tuning (PEFT) together with a Bayesian Optimisation search strategy to achieve LLM Personalization (MiLP).

What problem does this paper attempt to address?

This paper addresses how to effectively integrate user information into large language models (LLMs) to achieve personalized response generation. Existing methods have limitations: text prompting is constrained by the length of the LLM's context window, and memory-augmentation methods may fail to capture fine-grained information.

Inspired by biological memory mechanisms, the paper proposes a parameterized memory injection method (MiLP) that combines parameter-efficient fine-tuning (PEFT) with a Bayesian optimization search strategy to personalize LLMs. MiLP treats the feed-forward layers (FFLs) of the network as a memory mechanism that stores and activates user information. User memory is injected into the FFLs of the LLM through LoRA modules, and Bayesian optimization is used to determine the optimal configuration for storing and activating different memories. The paper also points out that different memories differ in their sensitivity to the parameter budget and to the position of the injection layer, so multiple LoRA modules and high-dimensional multi-objective Bayesian optimization are needed to find the optimal configuration.

Experimental results on three datasets show that MiLP significantly improves performance over baseline methods (text prompting, memory augmentation, and user embedding), supporting its effectiveness. In addition, the paper conducts a qualitative study and ablation studies, demonstrating the necessity of each MiLP component as well as the advantage of combining memory injection with instruction fine-tuning. Future work may explore larger user bases and larger-scale LLMs, and may improve the model's reasoning ability to understand user-specific needs.
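To make the idea of injecting memory into a feed-forward layer through LoRA more concrete, here is a minimal sketch of the standard LoRA formulation applied to a single linear projection. It is not the paper's implementation: the class name `LoRALinear`, the helper `matvec`, and the toy dimensions are all hypothetical and chosen for illustration only. The frozen weight `W` stands in for one FFL projection, and the low-rank pair `A`, `B` holds the trainable "memory". The rank is one example of the kind of parameter-budget setting that a search procedure such as Bayesian optimization would choose; the search itself is not shown.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A.

    Hypothetical illustration of standard LoRA, not the MiLP code.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen pretrained projection
        # A starts with small random values, B with zeros, so the
        # low-rank update contributes nothing before training.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)           # W x
        low = matvec(self.B, matvec(self.A, x))  # B (A x)
        return [b + self.scale * l for b, l in zip(base, low)]

    def num_trainable(self):
        """Number of parameters in the low-rank matrices A and B."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Toy 4x3 projection standing in for one feed-forward weight matrix.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 0.0, 3.0]]
layer = LoRALinear(W, rank=2)
print(layer.forward([1.0, 2.0, 3.0]))
print(layer.num_trainable())
```

Because `B` is initialised to zero, the layer initially behaves exactly like the frozen projection; only training `A` and `B` on user data changes its output. The trainable parameter count grows linearly with the rank, which is why rank and layer position are natural knobs for a budget-aware configuration search.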

Personalized LLM Response Generation with Parameterized Memory Injection

LLM-based Medical Assistant Personalization with Short- and Long-Term Memory Coordination

LLMs + Persona-Plug = Personalized LLMs

On the Way to LLM Personalization: Learning to Remember User Conversations

MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory

Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement

PMG : Personalized Multimodal Generation with Large Language Models

Few-shot Personalization of LLMs with Mis-aligned Responses

Democratizing Large Language Models via Personalized Parameter-Efficient Fine-tuning

PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers

Personalized Large Language Model Assistant with Evolving Conditional Memory

Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts

Personalized Large Language Models

Orchestrating LLMs with Different Personalizations

LDM$^2$: A Large Decision Model Imitating Human Cognition with Dynamic Memory Enhancement

Teach LLMs to Personalize -- An Approach inspired by Writing Education

MEMORYLLM: Towards Self-Updatable Large Language Models

Augmented Large Language Models with Parametric Knowledge Guiding