Abstract: Large Language Models (LLMs) have revolutionized natural language processing but can exhibit biases and may generate toxic content. While alignment techniques like Reinforcement Learning from Human Feedback (RLHF) reduce these issues, their impact on creativity, defined as syntactic and semantic diversity, remains unexplored. We investigate the unintended consequences of RLHF on the creativity of LLMs through three experiments focusing on the Llama-2 series. Our findings reveal that aligned models exhibit lower entropy in token predictions, form distinct clusters in the embedding space, and gravitate towards "attractor states", indicating limited output diversity. Our findings have significant implications for marketers who rely on LLMs for creative tasks such as copywriting, ad creation, and customer persona generation. The trade-off between consistency and creativity in aligned models should be carefully considered when selecting the appropriate model for a given application. We also discuss the importance of prompt engineering in harnessing the creative potential of base models.
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: **After large language models (LLMs) are aligned through reinforcement learning from human feedback (RLHF), are their creativity and output diversity negatively affected?**
Specifically, the paper asks whether the RLHF process makes LLMs too deterministic and homogeneous when generating text, at the cost of creativity and diversity. Creativity is defined here as the diversity of the generated text in terms of syntax (lexical choice, sentence structure, etc.) and semantics (meaning, emotion, opinion, etc.).
### Research Background
Large language models (LLMs) perform well on natural language processing tasks, but they may generate biased or toxic content. To reduce these problems, researchers have developed techniques such as RLHF to align the models with human values and preferences. However, these alignment techniques may inadvertently reduce a model's creativity and output diversity, which poses a challenge for applications that require a high degree of diversity and innovation (such as creative tasks in marketing).
### Research Methods
The paper explores the impact of RLHF on the creativity of LLMs through three experiments:
1. **Experiment 1: Customer Profile and Product Review Generation**
- **Purpose**: To compare the diversity of LLMs before and after alignment when generating customer profiles and product reviews.
- **Method**: Use the base model and the aligned model to generate 100 unique customer profiles respectively, each profile containing attributes such as name, gender, age, nationality, ethnicity, and personality type. Each simulated customer will also write a review for a hypothetical product (a smart coffee machine).
- **Analysis**: Compare the diversity of customer profiles and reviews generated by the two models through methods such as word clouds, distribution maps, sentiment analysis, and embedding vector clustering.
2. **Experiment 2: Semantic-level Diversity**
- **Purpose**: To evaluate the semantic diversity of the base model and the aligned model when expressing the same historical fact (information about Grace Hopper).
- **Method**: Set the initial prompt "Grace Hopper was" and generate 200 outputs, calculate the embedding vectors using SBERT, and then visualize them through t-SNE dimensionality reduction.
- **Analysis**: Compare the semantic diversity of the outputs generated by the two models through t-SNE visualization and cosine similarity analysis.
3. **Experiment 3: Syntactic-level Diversity**
- **Purpose**: To study the differences in phrase-level probability distributions between the base model and the aligned model when generating text.
- **Method**: Set the initial prompt "Steve is the CEO of a startup company", generate a background story of 64 tokens, record each generated token and the probability distribution of its top five predicted tokens, and calculate Shannon entropy.
- **Analysis**: Compare the syntactic diversity of the two models when generating text by calculating the average entropy.
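
The entropy calculation in Experiment 3 can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy probabilities are hypothetical, and two details are assumptions because the summary does not state them, namely that the top-five probabilities are renormalized to sum to 1 and that entropy is measured in bits (log base 2).

```python
import math

def shannon_entropy(probs, renormalize=True):
    """Shannon entropy, in bits, of one token-level distribution.

    Assumption: the top-5 probabilities are renormalized to sum to 1
    before the entropy is computed.
    """
    if renormalize:
        total = sum(probs)
        probs = [p / total for p in probs]
    # Zero-probability entries contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mean_entropy(steps):
    """Average entropy over all generation steps (one top-5 list per step)."""
    return sum(shannon_entropy(step) for step in steps) / len(steps)

# Hypothetical top-5 probabilities for two generation steps per model.
flat_steps = [[0.2, 0.2, 0.2, 0.2, 0.2], [0.3, 0.25, 0.2, 0.15, 0.1]]
peaked_steps = [[0.96, 0.01, 0.01, 0.01, 0.01], [0.9, 0.04, 0.03, 0.02, 0.01]]

print("flat:  ", round(mean_entropy(flat_steps), 3))
print("peaked:", round(mean_entropy(peaked_steps), 3))
```

A distribution spread evenly over five candidates reaches the maximum of log2(5) bits, while a distribution dominated by a single token approaches 0, so a lower average entropy indicates more deterministic generation.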
### Main Findings
- **Customer Profiles and Product Reviews**: The customer profiles generated by the aligned model show less diversity in terms of name, nationality, ethnicity, personality type, and age. The reviews generated by the aligned model are also more concentrated on positive emotions, and the sentence structures and contents are more similar.
- **Semantic Diversity**: The outputs of the aligned model form tighter clusters in the embedding space, indicating that the text it generates is more semantically homogeneous.
- **Syntactic Diversity**: The token probability distribution of the aligned model when generating text is more skewed, resulting in a lack of syntactic diversity in the generated text.
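
One way to quantify how tightly a set of outputs clusters in the embedding space, as in Experiment 2, is the mean pairwise cosine similarity. The sketch below is illustrative only: the function names are hypothetical, and the three-dimensional toy vectors stand in for the SBERT embeddings used in the paper, whose exact aggregation of cosine similarities is not specified in this summary.

```python
import math
from itertools import combinations

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all unordered pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)

# Toy stand-ins for sentence embeddings (real SBERT vectors have hundreds of dimensions).
tight_cluster = [[1.0, 0.1, 0.0], [1.0, 0.0, 0.1], [0.9, 0.1, 0.1]]
spread_out = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print("tight: ", round(mean_pairwise_cosine(tight_cluster), 3))
print("spread:", round(mean_pairwise_cosine(spread_out), 3))
```

A value close to 1 means the outputs point in nearly the same direction in embedding space, which corresponds to lower semantic diversity; values near 0 indicate outputs that are semantically far apart.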
### Conclusion
The paper shows that the RLHF process may cause aligned LLMs to become too deterministic and homogeneous when generating text, sacrificing creativity and output diversity. This finding matters for applications that require a high degree of diversity and innovation (such as marketing). When choosing a model, it is therefore necessary to weigh consistency against creativity.