Personalized Text Generation with Fine-Grained Linguistic Control

Bashar Alhafni, Vivek Kulkarni, Dhruv Kumar, Vipul Raheja

2024-02-07

Abstract: As the text generation capabilities of large language models become increasingly prominent, recent studies have focused on controlling particular aspects of the generated text to make it more personalized. However, most research on controllable text generation focuses on controlling the content or modeling specific high-level/coarse-grained attributes that reflect authors' writing styles, such as formality, domain, or sentiment. In this paper, we focus on controlling fine-grained attributes spanning multiple linguistic dimensions, such as lexical and syntactic attributes. We introduce a novel benchmark to train generative models and evaluate their ability to generate personalized text based on multiple fine-grained linguistic attributes. We systematically investigate the performance of various large language models on our benchmark and draw insights from the factors that impact their performance. We make our code, data, and pretrained models publicly available.

Computer Science

What problem does this paper attempt to address?

This paper asks how text generated by large language models can be made more personalized while allowing precise control over multiple fine-grained linguistic attributes. Rather than modeling only coarse attributes of an author's writing style, the authors target control across several linguistic dimensions, such as lexical and syntactic attributes.

To this end, they propose a new benchmark for training generative models and evaluating how well they produce personalized text. The benchmark covers multiple datasets, including blogs, movie reviews, and Amazon product reviews, with authors writing across various domains. Using it, the authors systematically analyze how different large language models perform at generating personalized text and examine the factors that influence that performance.

The authors also propose an improved method based on a prefix technique: the author's linguistic attribute vector is incorporated into the generative model's input, which improves performance on multi-attribute personalized style transfer tasks and helps the model generate text that conforms to the specified linguistic attributes.

In summary, the main contributions of this paper are:

1. **Benchmark**: A benchmark for personalized text generation.
2. **Linguistic Feature Analysis**: A detailed linguistic feature analysis of authors' writing styles, covering multiple dimensions such as vocabulary, syntax, and rhetoric.
3. **Improved Model**: An evaluation of various baseline models, together with improvements that significantly enhance performance on multi-attribute personalized style transfer tasks.
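The idea of feeding an author's linguistic attributes to the model as part of its input can be illustrated with a small sketch. This is not the authors' implementation: the four attributes, the function names (`linguistic_attributes`, `build_prefix`, `build_model_input`), and the `name:value` prefix format are all assumptions chosen for illustration, and a real system would use a richer attribute set and an actual language model.

```python
import re
from collections import OrderedDict


def linguistic_attributes(text: str) -> "OrderedDict[str, float]":
    """Compute a few toy lexical and syntactic attributes of a text.

    The attribute set here is hypothetical; it only stands in for the
    fine-grained attribute vector described in the summary above.
    """
    words = re.findall(r"[A-Za-z']+", text)
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words)

    attrs = OrderedDict()
    attrs["num_words"] = float(n_words)
    attrs["avg_word_len"] = sum(len(w) for w in words) / n_words if n_words else 0.0
    attrs["type_token_ratio"] = len({w.lower() for w in words}) / n_words if n_words else 0.0
    attrs["avg_sent_len"] = n_words / len(sentences) if sentences else 0.0
    return attrs


def build_prefix(attrs: "OrderedDict[str, float]") -> str:
    """Serialize the attribute vector as a plain-text prefix (assumed format)."""
    return " | ".join(f"{name}:{value:.2f}" for name, value in attrs.items())


def build_model_input(attrs: "OrderedDict[str, float]", prompt: str) -> str:
    """Prepend the attribute prefix to the prompt given to a generative model."""
    return f"{build_prefix(attrs)} || {prompt}"
```

In this sketch, the attributes would be computed from a sample of the target author's writing, and the string returned by `build_model_input` would be passed to a generative model that has been trained to condition on such prefixes.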

Teach LLMs to Personalize -- An Approach inspired by Writing Education

LongLaMP: A Benchmark for Personalized Long-form Text Generation

Controllable Text Generation for Open-Domain Creativity and Fairness

Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning

Research on user granularity-level personalized social text generation technology

Automated Evaluation of Personalized Text Generation using Large Language Models

Controllable Text Generation with Language Constraints

Controllable Text Generation for Large Language Models: A Survey

Fingerprinting Fine-tuned Language Models in the Wild

Toward Controlled Generation of Text

Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis

Fine-grained Controllable Text Generation through In-context Learning with Feedback

Personalized Language Modeling from Personalized Human Feedback

Learning to Generate Text in Arbitrary Writing Styles

Enhancing Text Generation in Joint NLG/NLU Learning Through Curriculum Learning, Semi-Supervised Training, and Advanced Optimization Techniques

Learning to Rewrite Prompts for Personalized Text Generation

Automatic Conditional Generation of Personalized Social Media Short Texts

Synthetic Data Generation in Low-Resource Settings via Fine-Tuning of Large Language Models

Unlocking Anticipatory Text Generation: A Constrained Approach for Large Language Models Decoding