Abstract: Creating human-like large language model (LLM) agents is crucial for faithful social simulation. Having LLMs role-play based on demographic information sometimes improves human likeness but often does not. This study assessed whether LLM alignment with human behavior can be improved by integrating information from empirically-derived human belief networks. Using data from a human survey, we estimated a belief network encompassing 64 topics loading on nine non-overlapping latent factors. We then seeded LLM-based agents with an opinion on one topic, and assessed the alignment of its expressed opinions on remaining test topics with corresponding human data. Role-playing based on demographic information alone did not align LLM and human opinions, but seeding the agent with a single belief greatly improved alignment for topics related in the belief network, and not for topics outside the network. These results suggest a novel path for human-LLM belief alignment in work seeking to simulate and understand patterns of belief distributions in society.

What problem does this paper attempt to address?

The core problem this paper addresses is how to improve the accuracy with which large language models (LLMs) simulate human beliefs and attitudes. Specifically, the researchers explored whether integrating information from an empirically derived human belief network can align LLM behavior with that of real humans more effectively than relying on demographic information alone.

### Problem Background

LLMs can currently role-play according to specific demographic characteristics, and this sometimes produces realistic, human-like behavior. However, role-playing based on demographic information does not always reflect real human beliefs and attitudes accurately. For example, when an LLM is asked to simulate people with different political stances answering questions about the unemployment rate, it may show human-like partisan biases, but the effect is unstable and hard to predict.

### Research Objectives

To improve the human-likeness of LLMs, the researchers proposed guiding LLM behavior with belief networks extracted from human survey data. Specifically, they used factor analysis to construct a belief network containing 64 topics and 9 latent factors. They then tested LLM performance under the following conditions:

1. **No Demographic Information**: The LLM does not use any demographic information for role-playing.
2. **Only Demographic Information**: The LLM uses demographic information for role-playing.
3. **Demographic Information plus Beliefs in Specific Topics**: In addition to demographic information, the LLM is given a belief on a specific topic.

### Experimental Design

The researchers first constructed a belief network through factor analysis and divided the beliefs into nine categories.
Then, they created multiple LLM agents (i.e., "digital twins") and evaluated the performance of these agents under different conditions. The specific steps were:

- **Initializing LLM Agents**: Provide demographic information and/or beliefs on specific topics through system messages.
- **Querying Opinions**: Ask the LLM agents for their opinions on a series of topics.
- **Evaluating the Degree of Alignment**: Calculate the difference between the opinions generated by the LLM agents and the real human opinions, using the mean absolute error (MAE) as the evaluation metric.

### Main Findings

1. **Limited Effect of Only Demographic Information**: Relying on demographic information alone does not significantly improve the agreement between LLM and real human opinions.
2. **Introducing Beliefs in Specific Topics Significantly Improves Alignment**: When an LLM is given a belief on a specific topic, alignment improves substantially on topics related to that belief in the belief network, but not on topics outside the network.
3. **Combining Demographic Information and Beliefs in Specific Topics Has the Best Effect**: Using both demographic information and beliefs on specific topics achieves the highest degree of alignment.

### Conclusions

This study shows that introducing information from an empirically derived human belief network can significantly improve the accuracy with which LLMs simulate human beliefs and attitudes. The method helps in understanding how beliefs are distributed in society and suggests a direction for building more realistic social-simulation tools.
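The three prompting conditions and the agent-initialization step from the experimental design can be sketched as follows. This is only an illustration: the function name `build_system_message`, the prompt wording, the demographic fields, and the topic label are all hypothetical, since the summary does not reproduce the paper's actual prompts.

```python
from typing import Optional, Tuple


def build_system_message(
    demographics: Optional[dict] = None,
    seed_belief: Optional[Tuple[str, str]] = None,
) -> str:
    """Assemble a system message for one agent (hypothetical wording)."""
    parts = ["You are role-playing a survey respondent."]
    if demographics:
        # Render demographic fields as "key: value" pairs.
        described = ", ".join(f"{k}: {v}" for k, v in demographics.items())
        parts.append(f"Your demographic profile is: {described}.")
    if seed_belief:
        # Seed the agent with a single opinion on one training topic.
        topic, opinion = seed_belief
        parts.append(f'Your opinion on "{topic}" is: {opinion}.')
    return " ".join(parts)


# Hypothetical respondent data; field names and values are placeholders.
demo = {"age": 34, "party": "independent"}
seed = ("topic_a", "strongly agree")

conditions = {
    "no_demographics": build_system_message(),
    "demographics_only": build_system_message(demographics=demo),
    "demographics_plus_belief": build_system_message(demo, seed),
}

for name, message in conditions.items():
    print(f"{name}: {message}")
```

Each resulting message would then be passed to an LLM as the system prompt before querying the agent's opinions on the test topics; that call is omitted here because the summary does not specify the model or API used.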
### Formula Summary

During the experiment, the researchers used the following formula to evaluate the performance of LLM agents:

\[ \text{MAE}_{\text{test}} = \frac{1}{|X_{\text{test}}|} \sum_{x \in X_{\text{test}}} |o_{i,x} - o'_{i,x}| \]

where:

- \( o_{i,x} \) is the real opinion of the \( i \)-th human respondent on topic \( x \).
- \( o'_{i,x} \) is the generated opinion of the corresponding LLM agent on the same topic \( x \).
- \( X_{\text{test}} \) is the set of test topics.

In addition, to measure the relative gain brought by introducing belief-network information, the researchers calculated:

\[ \text{Relative Gain (\%)} = \left( \frac{\text{MAE}_{\text{test}}(\text{Baseline})}{\text{MAE}_{\text{test}}(\text{New})} - 1 \right) \times 100 \]
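As a worked illustration of the two formulas above, the following minimal Python sketch computes the test MAE for one respondent and the relative gain between two conditions. The topic names, the opinion values, and the numeric rating scale are made-up placeholders, not data from the paper.

```python
from typing import Iterable, Mapping


def mae_test(
    human: Mapping[str, float],
    agent: Mapping[str, float],
    test_topics: Iterable[str],
) -> float:
    """Mean absolute error between human and agent opinions over X_test."""
    topics = list(test_topics)
    if not topics:
        raise ValueError("test_topics must not be empty")
    return sum(abs(human[x] - agent[x]) for x in topics) / len(topics)


def relative_gain(mae_baseline: float, mae_new: float) -> float:
    """Relative gain in percent: (MAE(Baseline) / MAE(New) - 1) * 100."""
    if mae_new == 0:
        raise ValueError("mae_new must be non-zero")
    return (mae_baseline / mae_new - 1.0) * 100.0


# Hypothetical opinions of one respondent on an assumed numeric scale.
topics = ["topic_a", "topic_b", "topic_c"]
human = {"topic_a": 5, "topic_b": 2, "topic_c": 4}
agent_baseline = {"topic_a": 3, "topic_b": 4, "topic_c": 2}  # e.g. demographics only
agent_seeded = {"topic_a": 4, "topic_b": 3, "topic_c": 3}    # e.g. with a seeded belief

baseline_mae = mae_test(human, agent_baseline, topics)  # (2 + 2 + 2) / 3 = 2.0
seeded_mae = mae_test(human, agent_seeded, topics)      # (1 + 1 + 1) / 3 = 1.0
gain = relative_gain(baseline_mae, seeded_mae)          # (2.0 / 1.0 - 1) * 100 = 100.0

print(baseline_mae, seeded_mae, gain)
```

Because a lower MAE means better alignment, a positive relative gain indicates that the new condition tracks the human respondent more closely than the baseline.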

Beyond Demographics: Aligning Role-playing LLM-based Agents Using Human Belief Networks
