Detecting synthetic population bias using a spatially-oriented framework and independent validation data

Jessica Embury,Atsushi Nara,Sergio Rey,Ming-Hsiang Tsou,Sahar Ghanipoor Machiani
DOI: https://doi.org/10.1080/13658816.2024.2358399
2024-05-25
International Journal of Geographical Information Science
Abstract:Models of human mobility can be broadly applied to find solutions addressing diverse topics such as public health policy, transportation management, emergency management, and urban development. However, many mobility models require individual-level data that is limited in availability and accessibility. Synthetic populations are commonly used as the foundation for mobility models because they provide detailed individual-level data representing the different types and characteristics of people in a study area. Thorough evaluation of synthetic populations is required to detect data biases before the prejudices are transferred to subsequent applications. Although synthetic populations are commonly used for modeling mobility, they are conventionally validated by their sociodemographic characteristics, rather than mobility attributes. Mobility microdata provides an opportunity to independently/externally validate the mobility attributes of synthetic populations. This study demonstrates a spatially-oriented data validation framework and independent data validation to assess the mobility attributes of two synthetic populations at different spatial granularities. Validation using independent data (SafeGraph) and the validation framework replicated the spatial distribution of errors detected using source data (LODES) and total absolute error. Spatial clusters of error exposed the locations of underrepresented and overrepresented communities. This information can guide bias mitigation efforts to generate a more representative synthetic population.
geography, physical,computer science, information systems,information science & library science
What problem does this paper attempt to address?