LLMs generate structurally realistic social networks but overestimate political homophily

Serina Chang, Alicja Chaszczewicz, Emma Wang, Maya Josifovska, Emma Pierson, Jure Leskovec
2024-08-29
Abstract: Generating social networks is essential for many applications, such as epidemic modeling and social simulations. Prior approaches either involve deep learning models, which require many observed networks for training, or stylized models, which are limited in their realism and flexibility. In contrast, LLMs offer the potential for zero-shot and flexible network generation. However, two key questions are: (1) are LLMs' generated networks realistic, and (2) what are the risks of bias, given the importance of demographics in forming social ties? To answer these questions, we develop three prompting methods for network generation and compare the generated networks to real social networks. We find that more realistic networks are generated with "local" methods, where the LLM constructs relations for one persona at a time, compared to "global" methods that construct the entire network at once. We also find that the generated networks match real networks on many characteristics, including density, clustering, community structure, and degree. However, we find that LLMs emphasize political homophily over all other types of homophily and overestimate political homophily relative to real-world measures.
Computers and Society, Artificial Intelligence, Social and Information Networks
What problem does this paper attempt to address?
The paper studies how well large language models (LLMs) generate social networks and what biases arise in the process. It addresses two core questions:

1. **Are the generated social networks realistic?** The researchers develop three prompting methods for network generation and compare the generated networks with real social networks, evaluating whether their structural characteristics (such as density, clustering coefficient, and connectivity) are close to those of real-world networks.
2. **Is there bias in the generation process?** The paper pays particular attention to whether LLM-generated networks overemphasize political homophily, and compares it with other types of homophily (such as gender, age, and race).

The experiments show that the "local" method, in which the LLM constructs relationships from the perspective of one person at a time, produces network structures closer to reality than the "global" method, which constructs the entire network at once. However, LLMs overemphasize political homophily when generating networks, beyond what real-world measurements support. This overemphasis persists even when non-demographic characteristics such as interests are included. Overall, although LLMs show potential for generating social networks, the study also reveals challenges in how they incorporate demographic information.
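To make the two kinds of measurement concrete, here is a minimal sketch of how density and a homophily ratio could be computed on a small undirected network. It is an illustration under stated assumptions, not the paper's evaluation code: the toy edge list, the `"A"`/`"B"` group labels, and all function names are hypothetical, and the homophily ratio shown (observed share of same-group edges divided by the share expected if ties formed at random) is one common formulation that may differ from the measure the authors use.

```python
from itertools import combinations


def density(n_nodes, edges):
    """Fraction of all possible undirected edges that are present."""
    possible = n_nodes * (n_nodes - 1) / 2
    return len(edges) / possible


def same_group_edge_fraction(edges, group):
    """Observed share of edges whose endpoints carry the same label."""
    same = sum(1 for u, v in edges if group[u] == group[v])
    return same / len(edges)


def expected_same_group_fraction(group):
    """Share of all node pairs with the same label (random-mixing baseline)."""
    pairs = list(combinations(group, 2))
    same = sum(1 for u, v in pairs if group[u] == group[v])
    return same / len(pairs)


def homophily_ratio(edges, group):
    """Values above 1 mean same-group ties are more common than chance.

    Assumes a non-empty edge list and at least one same-label node pair.
    """
    return same_group_edge_fraction(edges, group) / expected_same_group_fraction(group)


# Hypothetical toy network: six personas, two made-up political labels.
group = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (2, 3)]

print("density:", density(len(group), edges))
print("homophily ratio:", homophily_ratio(edges, group))
```

Computing the same ratio on a generated network and on a real one, for each attribute in turn, is one way to check whether a single attribute such as political affiliation is overrepresented in the generated ties.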