Data-driven agent-based modeling in computational social science

Jan Lorenz
DOI: https://doi.org/10.4324/9781003024583-11
2021-11-10
Abstract: With agent-based models, computational social scientists want to explain the emergence of social phenomena, such as ethnic and social segregation, opinion polarization, the rise of mass protests, economic and political cycles, or the ups and downs of epidemic spreading of infectious diseases. While modeling, researchers may experiment with theories and assumptions about the behavior of individuals and link them to the societal level through computer simulation of repeated social interaction of many artificial individuals. Similarly, researchers play with institutional settings, modifying the context in which agents interact. Such agent-based models can be seen as a tool to explore multi-player games where actors’ rationality is bounded. In both senses, agent-based modeling is intrinsically theory driven. The strength of agent-based modeling is the quantitative causal understanding of emergence through complex nonlinear dynamics on the macro-level. Therefore, methods of validation with empirical data are less easily specified and more diverse than in variable-based regression models. In particular, the nature of the data–theory link is closely related to the main research purpose and can deviate from validation. Nevertheless, agent-based models are meant to explain real-world phenomena, and thus a useful model should be reflected in empirical data, and vice versa. This chapter is about how agent-based modeling can be data-driven. This includes (1) the consideration of data structure in the model building process, (2) a parallel data exploration searching for and quantifying “stylized facts” for later use in validation, (3) the iteration of model revisions to increase its replicative validity, and (4) the calibration of model parameters with data for forecasting or counterfactual simulations. These aspects of data-driven agent-based modeling are illustrated with two examples on segregation and polarization.
Agent-based modeling and simulation are core methods in computational social science, alongside making sense of large amounts of human-generated data and studying and shaping how digitization changes human societies. The segregation model’s fundamental insight is that ethnic homophily can produce heavily segregated towns even when similarity preferences are mild. For real-world societies, macroscopic transitions at critical thresholds in agent-based models imply that small changes in a society's population or environmental conditions can trigger substantial qualitative changes. The data-driven way to derive stylized facts is exploratory data analysis, which is the core part of the data science process.
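The segregation insight can be illustrated with a minimal sketch, assuming a Schelling-type grid model; the chapter's own model may differ. Everything here is a hypothetical choice made for illustration: the function names, the wrap-around grid, the eight-cell neighborhood, the rule that unhappy agents jump to a random empty cell, and all parameter values. An agent is unhappy when the share of same-group agents among its occupied neighbors falls below `threshold`.

```python
import random


def neighbors(grid, r, c):
    """Return the groups of the occupied Moore neighbors of (r, c) on a torus."""
    n = len(grid)
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            value = grid[(r + dr) % n][(c + dc) % n]
            if value is not None:
                found.append(value)
    return found


def similarity(grid, r, c):
    """Share of occupied neighbors in the same group as the agent at (r, c)."""
    nb = neighbors(grid, r, c)
    if not nb:
        return 1.0  # assumption: an agent without neighbors counts as content
    return sum(v == grid[r][c] for v in nb) / len(nb)


def mean_similarity(grid):
    """Average same-group neighbor share over all agents (a segregation index)."""
    n = len(grid)
    values = [similarity(grid, r, c)
              for r in range(n) for c in range(n) if grid[r][c] is not None]
    return sum(values) / len(values)


def run_schelling(size=20, empty_share=0.1, threshold=0.3, sweeps=50, seed=1):
    """Simulate and return the segregation index before and after relocation."""
    rng = random.Random(seed)
    n_cells = size * size
    n_empty = int(empty_share * n_cells)
    n_agents = n_cells - n_empty
    # Two groups of (almost) equal size plus empty cells, placed at random.
    cells = [0] * (n_agents // 2) + [1] * (n_agents - n_agents // 2)
    cells += [None] * n_empty
    rng.shuffle(cells)
    grid = [cells[i * size:(i + 1) * size] for i in range(size)]
    before = mean_similarity(grid)
    for _ in range(sweeps):
        unhappy = [(r, c) for r in range(size) for c in range(size)
                   if grid[r][c] is not None
                   and similarity(grid, r, c) < threshold]
        if not unhappy:
            break  # everyone is content, the configuration is stable
        rng.shuffle(unhappy)
        for r, c in unhappy:
            # Skip agents that became content through earlier moves this sweep.
            if similarity(grid, r, c) >= threshold:
                continue
            empties = [(i, j) for i in range(size) for j in range(size)
                       if grid[i][j] is None]
            i, j = rng.choice(empties)
            grid[i][j], grid[r][c] = grid[r][c], None
    return before, mean_similarity(grid)


if __name__ == "__main__":
    before, after = run_schelling()
    print(f"mean same-group neighbor share: {before:.2f} -> {after:.2f}")
```

With a threshold of 0.3, agents accept being a local minority, yet the index returned after relocation is typically well above the roughly one half expected for a random mixture. Re-running the function over a range of `threshold` values is one simple way to look for the kind of critical transition mentioned above.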