A multi-glimpse deep learning architecture to estimate socioeconomic census metrics in the context of extreme scope variance

Dan Runfola, Anthony Stefanidis, Zhonghui Lv, Joseph O’Brien, Heather Baier
DOI: https://doi.org/10.1080/13658816.2024.2305636
2024-02-02
International Journal of Geographical Information Science
Abstract: Convolutional Neural Networks (CNNs) are leveraged for a wide range of satellite imagery information extraction tasks. However, for tasks which seek to estimate aggregated information across highly variable geographic extents, existing techniques are subject to critical limitations. We engage with a specific case study exploring this challenge: estimating census variables across 2358 Mexican municipalities, which range in scope from 2.21 km² (~74,000 30 m pixels) to 72,417.9 km² (millions of pixels). Building on recent literature which has illustrated the capability of deep learning to extract socioeconomic information from satellite imagery, we specifically seek to establish baseline metrics of error that might be expected when estimating a range of census variables based on coarse-resolution (Landsat) satellite imagery alone. For each of 52 variables, we implement a multi-glimpse recurrent attention model, in which we parametrically determine subsets of each municipality to sample across iterative steps. Results of a five-fold validation indicate that nearly half of the tested variables (22) can be estimated with r² values greater than 0.75. Results suggest considerable promise for the use of satellite imagery to estimate socioeconomic factors in both historic time periods for which surveys were not conducted, as well as contemporary inaccessible regions.
geography, physical; computer science, information systems; information science & library science
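The abstract reports five-fold validation results as r² values. As a rough illustration of what that kind of evaluation involves, the sketch below computes held-out r² for each of five folds. It is not the authors' pipeline: a one-feature least-squares line stands in for the multi-glimpse recurrent attention model, the data are synthetic, and the names `fit_line`, `k_fold_indices`, and `cross_validated_r2` are hypothetical. Whether the paper reports per-fold or pooled r² is not stated in the abstract, so per-fold scoring here is an assumption.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x.

    Assumption: a trivial stand-in for the deep model in the paper.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_r2(xs, ys, k=5, seed=0):
    """Fit on k-1 folds, score r^2 on the held-out fold, for every fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        intercept, slope = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        predictions = [intercept + slope * xs[i] for i in held_out]
        scores.append(r_squared([ys[i] for i in held_out], predictions))
    return scores


# Synthetic data (not from the paper): a noisy linear relationship.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

scores = cross_validated_r2(xs, ys, k=5)
for fold, score in enumerate(scores, start=1):
    print(f"fold {fold}: r^2 = {score:.3f}")
```

Scoring only on held-out units is what makes the figure an estimate of out-of-sample error rather than of fit to the training data.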