Synthetic Census Data Generation via Multidimensional Multiset Sum

Cynthia Dwork, Kristjan Greenewald, Manish Raghavan
2024-04-16
Abstract: The US Decennial Census provides valuable data for both research and policy purposes. Census data are subject to a variety of disclosure avoidance techniques prior to release in order to preserve respondent confidentiality. While many are interested in studying the impacts of disclosure avoidance methods on downstream analyses, particularly with the introduction of differential privacy in the 2020 Decennial Census, these efforts are limited by a critical lack of data: the underlying "microdata," which serve as necessary input to disclosure avoidance methods, are kept confidential.
Computers and Society, Cryptography and Security, Data Structures and Algorithms