ElectionSim: Massive Population Election Simulation Powered by Large Language Model Driven Agents

Xinnong Zhang, Jiayu Lin, Libo Sun, Weihong Qi, Yihang Yang, Yue Chen, Hanjia Lyu, Xinyi Mou, Siming Chen, Jiebo Luo, Xuanjing Huang, Shiping Tang, Zhongyu Wei
2024-10-28
Abstract: The massive population election simulation aims to model the preferences of specific groups in particular election scenarios. It has garnered significant attention for its potential to forecast real-world social trends. Traditional agent-based modeling (ABM) methods are constrained by their ability to incorporate complex individual background information and provide interactive prediction results. In this paper, we introduce ElectionSim, an innovative election simulation framework based on large language models, designed to support accurate voter simulations and customized distributions, together with an interactive platform to dialogue with simulated voters. We present a million-level voter pool sampled from social media platforms to support accurate individual simulation. We also introduce PPE, a poll-based presidential election benchmark to assess the performance of our framework under the U.S. presidential election scenario. Through extensive experiments and analyses, we demonstrate the effectiveness and robustness of our framework in U.S. presidential election simulations.
Computation and Language, Computers and Society, Human-Computer Interaction
What problem does this paper attempt to address?
This paper addresses three main problems:

1. **How can election simulation achieve high precision at the individual level?** Traditional agent-based modeling (ABM) methods struggle to integrate complex individual background information, so they simulate individual behavior imprecisely. Large language models (LLMs) can generate human-like behavior, but in large-scale election simulations they lack sufficient personalized input data and cannot fully capture the diversity of voter behaviors, motivations, and decision-making processes. This directly affects the accuracy of macro-level results.
2. **How can customized distributions be generated that are consistent with real-world statistics?** An accurate election simulation requires simulated individuals that represent the diversity and distribution of the real-world population. Random sampling can capture diversity, but it is hard to align with the real-world demographic distribution and is prone to bias. A deliberate sampling strategy is therefore needed so that the distribution of simulated individuals matches that of actual voters.
3. **How can the performance of election simulations be evaluated systematically?** Existing evaluation methods focus mainly on prediction accuracy, which is too narrow to assess every aspect of the simulation results. A multi-faceted evaluation method is needed to analyze the effectiveness of election simulations more comprehensively.

To solve these problems, the paper proposes the **ElectionSim** framework, which uses agents driven by large language models for large-scale election simulations. Specifically:

- **ElectionSim**: A large amount of user data is collected from social media platforms to build a diverse voter pool of millions of users, and methods such as iterative proportional fitting (IPF) are used to make the sample distribution consistent with the real-world voter distribution.
- **PPE Benchmark**: A poll-based presidential election (PPE) benchmark is introduced to evaluate the accuracy of simulation results. Experiments show that the framework's results in the U.S. presidential election simulation are very close to the actual election results: its predictions match the actual outcome in 46 of 51 states and in 12 of 15 swing states.

With these methods, the ElectionSim framework improves simulation accuracy at the individual level, keeps the simulation results highly consistent with the real world, and provides a systematic evaluation method.
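The summary names iterative proportional fitting (IPF) without describing it. As a rough illustration of the general technique, and not the paper's actual implementation, the sketch below rescales a two-dimensional table of sampled voter counts until its row and column totals match target marginals. The function name `ipf`, the demographic labels, and all numbers are hypothetical and chosen only for the example.

```python
def ipf(seed, row_targets, col_targets, max_iter=200, tol=1e-9):
    """Two-dimensional iterative proportional fitting (illustrative sketch).

    seed:        table of strictly positive counts (list of rows)
    row_targets: desired total for each row
    col_targets: desired total for each column
    The row targets and column targets must sum to the same value.
    """
    table = [[float(v) for v in row] for row in seed]
    for _ in range(max_iter):
        # Step 1: scale each row so that it sums to its target.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            table[i] = [v * target / row_sum for v in table[i]]
        # Step 2: scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            for row in table:
                row[j] *= target / col_sum
        # Columns now match exactly; stop once the rows match as well.
        if all(abs(sum(row) - t) <= tol
               for row, t in zip(table, row_targets)):
            break
    return table


# Hypothetical example: rows are two age groups, columns are two genders.
# The seed holds raw counts from a sampled pool; the targets are assumed
# population shares (for example, taken from census-style statistics).
seed = [[30.0, 10.0],
        [20.0, 40.0]]
row_targets = [0.45, 0.55]   # assumed age-group shares
col_targets = [0.52, 0.48]   # assumed gender shares

fitted = ipf(seed, row_targets, col_targets)
for row in fitted:
    print([round(v, 4) for v in row])
```

Each cell of the fitted table can then be read as the share of simulated voters to draw from the matching demographic cell of the pool. A real pipeline would fit more than two attributes at once, which uses the same alternating rescaling over each marginal in turn.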