Probabilistic Couplings for Probabilistic Reasoning

Justin Hsu
DOI: https://doi.org/10.48550/arXiv.1710.09951
2017-11-01
Abstract: This thesis explores proofs by coupling from the perspective of formal verification. Long employed in probability theory and theoretical computer science, these proofs construct couplings between the output distributions of two probabilistic processes. Couplings can imply various guarantees comparing two runs of a probabilistic computation. We first show that proofs in the program logic pRHL describe couplings. We formalize couplings that establish various probabilistic properties, including distribution equivalence, convergence, and stochastic domination. Then we give a proofs-as-programs interpretation: a coupling proof encodes a probabilistic product program, whose properties imply relational properties of the original programs. We design the logic xpRHL to construct the product, with extensions to model shift coupling and path coupling. We then propose an approximate version of probabilistic coupling and a corresponding proof technique, proof by approximate coupling, inspired by the logic apRHL, a version of pRHL for building approximate liftings. Drawing on ideas from existing privacy proofs, we extend apRHL with novel proof rules for constructing new approximate couplings. We give an approximate coupling proof of privacy for the Sparse Vector mechanism, a well-known algorithm from the privacy literature whose privacy proof is notoriously subtle, and produce the first formalized proof of privacy for Sparse Vector in apRHL. Finally, we propose several more sophisticated constructions for approximate couplings: a principle for showing accuracy-dependent privacy, a generalization of the advanced composition theorem, and an optimal approximate coupling relating two subsets. We also show equivalences between our approximate couplings and other existing definitions. These ingredients support the first formalized proof of privacy for the Between Thresholds mechanism.
Logic in Computer Science, Data Structures and Algorithms, Programming Languages