A Simple Algorithm for Estimating Distribution Parameters from $n$-Dimensional Randomized Binary Responses

Staal A. Vinterbo
DOI: https://doi.org/10.48550/arXiv.1803.03981
2018-07-13
Abstract: Randomized response is attractive for privacy-preserving data collection because the privacy it provides can be quantified by means such as differential privacy. However, recovering and analyzing statistics involving multiple dependent randomized binary attributes can be difficult, posing a significant barrier to use. In this work, we address this problem by identifying and analyzing a family of response randomizers that change each binary attribute independently with the same probability. Modes of Google's Rappor randomizer as well as applications of two well-known classical randomized response methods, Warner's original method and Simmons' unrelated question method, belong to this family. We show that randomizers in this family transform multinomial distribution parameters by an iterated Kronecker product of an invertible and bisymmetric $2 \times 2$ matrix. This allows us to present a simple and efficient algorithm for obtaining unbiased maximum likelihood parameter estimates for $k$-way marginals from randomized responses and provide theoretical bounds on the statistical efficiency achieved. We also describe the tradeoff between efficiency and differential privacy. Importantly, both randomization of responses and the estimation algorithm are simple to implement, an aspect critical to technologies for privacy protection and security.
Cryptography and Security
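
The Kronecker-product structure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm as published: the abstract does not give the $2 \times 2$ matrix, so the sketch assumes the simplest member of the family, in which each bit is flipped independently with probability `p` (with `p` not equal to 0.5). The function names `flip_matrix`, `inverse_flip_matrix`, and `apply_per_bit` are hypothetical. Outcomes over `n` bits are indexed by the integer whose binary digits are the attribute values.

```python
def flip_matrix(p):
    """Assumed 2x2 randomizer matrix: keep a bit with prob. 1-p, flip it with prob. p."""
    return [[1.0 - p, p], [p, 1.0 - p]]


def inverse_flip_matrix(p):
    """Closed-form inverse of flip_matrix(p); requires p != 0.5 (determinant 1-2p)."""
    det = 1.0 - 2.0 * p
    if det == 0.0:
        raise ValueError("p = 0.5 makes the randomizer non-invertible")
    return [[(1.0 - p) / det, -p / det], [-p / det, (1.0 - p) / det]]


def apply_per_bit(vec, m, n):
    """Multiply vec (length 2**n) by the n-fold Kronecker power of the 2x2 matrix m.

    The Kronecker power acts on each bit position separately, so the full
    2**n x 2**n matrix is never formed.
    """
    if len(vec) != 1 << n:
        raise ValueError("vec must have length 2**n")
    out = list(vec)
    for bit in range(n):
        stride = 1 << bit
        for i in range(len(out)):
            if (i & stride) == 0:
                # Entries i and i|stride differ only in this bit position.
                a, b = out[i], out[i | stride]
                out[i] = m[0][0] * a + m[0][1] * b
                out[i | stride] = m[1][0] * a + m[1][1] * b
    return out


# Usage: distribution over two binary attributes (outcomes 00, 01, 10, 11).
true_params = [0.1, 0.2, 0.3, 0.4]
p = 0.25
n = 2

# Distribution of the randomized responses under the assumed randomizer.
randomized = apply_per_bit(true_params, flip_matrix(p), n)

# Applying the inverse per bit recovers the original parameters.
recovered = apply_per_bit(randomized, inverse_flip_matrix(p), n)
```

In practice `randomized` would be replaced by the observed relative frequencies of the randomized responses. With finite samples, the inverted values can fall slightly outside $[0, 1]$; how to handle that is not covered by this sketch.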