AI Text-to-Behavior: A Study In Steerability

David Noever, Sam Hyams
DOI: https://doi.org/10.48550/arXiv.2308.07326
2023-08-08
Abstract: The research explores the steerability of Large Language Models (LLMs), particularly OpenAI's ChatGPT iterations. By employing a behavioral psychology framework called OCEAN (Openness, Conscientiousness, Extroversion, Agreeableness, Neuroticism), we quantitatively gauged the model's responsiveness to tailored prompts. When asked to generate text mimicking an extroverted personality, OCEAN scored the language alignment to that behavioral trait. In our analysis, while "openness" presented linguistic ambiguity, "conscientiousness" and "neuroticism" were distinctly evoked in the OCEAN framework, with "extroversion" and "agreeableness" showcasing a notable overlap yet distinct separation from other traits. Our findings underscore GPT's versatility and ability to discern and adapt to nuanced instructions. Furthermore, historical figure simulations highlighted the LLM's capacity to internalize and project instructible personas, precisely replicating their philosophies and dialogic styles. However, the rapid advancements in LLM capabilities and the opaque nature of some training techniques make metric proposals degrade rapidly. Our research emphasizes a quantitative role to describe steerability in LLMs, presenting both its promise and areas for further refinement in aligning its progress to human intentions.
Artificial Intelligence, Computation and Language, Machine Learning
What problem does this paper attempt to address?
This paper addresses the **steerability** of large language models (LLMs): how far specific prompts can guide or control a model such as OpenAI's ChatGPT so that it generates text in a particular emotional or intellectual style. Although steerability has received some attention, there is no standardized way to measure it or to compare it across models. The main objective is therefore to develop quantitative indicators of how well an LLM responds to user prompts, particularly when simulating personality traits and when conversing as historical figures. To this end, the research adopts the **OCEAN model** (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) as a behavioral psychology framework and uses customized prompts to test whether the model generates text that matches the intended trait. It also designs experiments that simulate conversations with historical figures, examining whether the model can stay in character in complex situations and produce dialogue consistent with each figure's background. These experiments help characterize the steerability of LLMs and point to directions for future research, especially improving models' cultural sensitivity, ethics, and global relevance.
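The evaluation idea described above (prompt the model for one trait, score the output on all five OCEAN traits, check whether the intended trait comes out on top) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: `build_prompt`, `steerability_matrix`, and `hit_rate` are hypothetical names, the prompt wording is invented, and `echo_generate` and `keyword_score` are toy stand-ins for a real LLM call and a real OCEAN scorer, neither of which is specified in this summary.

```python
from typing import Callable, Dict

# The five OCEAN traits named in the summary.
TRAITS = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]

Scores = Dict[str, float]


def build_prompt(trait: str) -> str:
    # Hypothetical prompt wording; the paper's actual prompts are not given here.
    return f"Write a short paragraph as someone who scores very high on {trait}."


def steerability_matrix(generate: Callable[[str], str],
                        score: Callable[[str], Scores]) -> Dict[str, Scores]:
    """For each target trait, generate text and score it on all five traits."""
    return {target: score(generate(build_prompt(target))) for target in TRAITS}


def hit_rate(matrix: Dict[str, Scores]) -> float:
    """Fraction of target traits whose own score is the highest in their row."""
    hits = sum(1 for target, row in matrix.items()
               if max(row, key=row.get) == target)
    return hits / len(matrix)


# Toy stand-ins so the sketch runs without any external service (assumptions).
def echo_generate(prompt: str) -> str:
    # A real setup would call an LLM here; this just returns the prompt.
    return prompt


def keyword_score(text: str) -> Scores:
    # A real setup would use a trained OCEAN scorer; this only matches trait names.
    return {t: 1.0 if t in text.lower() else 0.0 for t in TRAITS}


if __name__ == "__main__":
    matrix = steerability_matrix(echo_generate, keyword_score)
    print(hit_rate(matrix))
```

Passing the generator and the scorer as plain callables keeps the loop independent of any particular model or scoring tool, which matters for a metric meant to compare steerability across models. The off-diagonal entries of the matrix are where overlap between traits, such as the extroversion and agreeableness overlap noted in the abstract, would show up.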