Leveraging Differential Privacy in Geospatial Analyses of Standardized Healthcare Data

Daniel R Harris
DOI: https://doi.org/10.1109/bigdata50022.2020.9378390
Abstract: We present a collection of geodatabase functions that expedite the use of differential privacy for privacy-aware geospatial analysis of healthcare data. The healthcare domain has a long history of standardization, and research communities have developed open-source common data models to support the larger goals of interoperability, reproducibility, and data sharing; these models also standardize geospatial patient data. However, patient privacy laws and institutional regulations restrict how data and results may be shared, which complicates geospatial analyses and the dissemination of research findings. The result is infrastructure that is well suited to organizing and storing healthcare data, yet lacks the innate ability to produce shareable results that preserve privacy and conform to regulatory requirements. Differential privacy is a model for performing privacy-preserving analytics. We detail our process and findings in inserting an open-source library for differential privacy into a workflow that leverages a geodatabase for geocoding and analyzing geospatial data stored as part of the Observational Medical Outcomes Partnership (OMOP) common data model. We pilot this process using an open big data repository of addresses.
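
As an illustration of the kind of privacy-preserving geospatial aggregate the abstract refers to, the following is a minimal sketch of the Laplace mechanism applied to per-region patient counts. It is not the paper's geodatabase functions or the open-source library it uses; the function names (`laplace_noise`, `dp_counts_by_region`), the region labels, and the choice of epsilon are hypothetical, and the sketch assumes each patient contributes to exactly one region, so that each count has sensitivity 1.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_counts_by_region(patient_regions, all_regions, epsilon, rng=None):
    """Return noisy patient counts for every region in `all_regions`.

    Assumptions (hypothetical, for illustration only):
    - each patient appears once and belongs to exactly one region, so
      adding or removing a patient changes a single count by 1;
    - `all_regions` is a fixed, public list, so the set of released
      keys does not itself depend on the private data.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    counts = Counter(patient_regions)
    scale = 1.0 / epsilon  # sensitivity 1 divided by the privacy budget
    return {
        region: counts.get(region, 0) + laplace_noise(scale, rng)
        for region in all_regions
    }


if __name__ == "__main__":
    # Hypothetical geocoded output: one region label per patient.
    patients = ["county_a", "county_a", "county_b", "county_a"]
    regions = ["county_a", "county_b", "county_c"]
    print(dp_counts_by_region(patients, regions, epsilon=0.5))
```

Smaller values of epsilon add more noise and give stronger privacy; the noisy counts may be negative or non-integer, so any rounding or clamping would be a separate post-processing step.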