Spatial-Attention and Demographic-Augmented Generative Adversarial Imputation Network for Population Health Data Reconstruction

Yujie Feng, Jiangtao Wang, Yasha Wang, Xu Chu
DOI: https://doi.org/10.1109/TBDATA.2022.3227089
2023-01-01
IEEE Transactions on Big Data
Abstract: As a fundamental component of the public health system, population health monitoring plays an important role in shaping health policy. However, traditional data collection is costly, so many sparse-sampling completion algorithms have been proposed to reduce the amount of data that must be collected. Existing data-completion methods usually rely on correlations between spatially adjacent areas, but these correlations are not sufficient for accurate inference when prevalence data for an area's neighbors are also missing due to cost constraints. To tackle this problem, we propose a novel deep-learning-based prevalence inference model called the Spatial-attention and Demographic-augmented Generative Adversarial Imputation Network (SDA-GAIN). SDA-GAIN improves accuracy by learning novel "health semantic space similarities" between areas that need not be spatially adjacent. The key insight of SDA-GAIN is to use a Transformer-based model to learn health semantic similarities between areas and a GAN-based model to perform high-accuracy completion. We further introduce demographic data, processed by a CNN, to help the model learn a better health semantic representation. Extensive experiments show that SDA-GAIN outperforms other state-of-the-art approaches at low sampling rates (below 30%), which significantly reduces sampling costs. In addition, visualizations of the health semantic similarity learned by SDA-GAIN closely match the real situation.
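The idea of borrowing information from semantically similar areas rather than only from spatial neighbors can be illustrated with a small sketch. The code below is not SDA-GAIN: it has no learned Transformer projections, no GAN, and no CNN. It only assumes that each area has a fixed feature vector (for example, demographic descriptors) and fills each unsampled area with a scaled dot-product attention average over the sampled areas. The function name `attention_impute` and all data values are hypothetical.

```python
import numpy as np

def attention_impute(features, values, observed):
    """Fill unobserved values with an attention-weighted average of observed ones.

    features: (n_areas, d) per-area descriptors (assumed given, not learned here).
    values:   (n_areas,) prevalence values; entries at unobserved areas are ignored.
    observed: (n_areas,) boolean mask, True where the area was sampled.
    Returns the completed values and the (n_areas, n_areas) attention weights.
    """
    features = np.asarray(features, dtype=float)
    values = np.asarray(values, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    if not observed.any():
        raise ValueError("at least one area must be observed")

    d = features.shape[1]
    # Scaled dot-product similarity between every pair of areas.
    scores = features @ features.T / np.sqrt(d)
    # Unobserved areas cannot act as sources, so mask out their columns.
    scores = np.where(observed[None, :], scores, -np.inf)
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)

    # Replace ignored entries (possibly NaN) with 0 before the weighted sum.
    safe_values = np.where(observed, values, 0.0)
    estimates = weights @ safe_values
    # Keep sampled values as they are; use estimates only where data is missing.
    return np.where(observed, values, estimates), weights

# Hypothetical toy data: area 3 is unsampled and resembles area 0, not areas 1 and 2.
features = np.array([[4.0, 0.0], [0.0, 4.0], [0.0, 4.0], [4.0, 0.0]])
values = np.array([0.10, 0.30, 0.32, np.nan])
observed = np.array([True, True, True, False])

filled, weights = attention_impute(features, values, observed)
print(filled)
```

In this toy setting the estimate for area 3 is dominated by area 0 because their feature vectors match, regardless of whether the two areas are geographically adjacent. In the paper's model the similarities are learned rather than computed from fixed features, and the completion is produced by a GAN-based imputer.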