Decouple-and-Sample: Protecting sensitive information in task agnostic data release

Abhishek Singh, Ethan Garza, Ayush Chopra, Praneeth Vepakomma, Vivek Sharma, Ramesh Raskar
DOI: https://doi.org/10.48550/arXiv.2203.13204
2022-03-18
Abstract: We propose sanitizer, a framework for secure and task-agnostic data release. Releasing datasets continues to have a large impact on many applications of computer vision, but that impact is mostly realized only when data sharing is not inhibited by privacy concerns. We alleviate these concerns by sanitizing datasets in a two-stage process. First, we introduce a global decoupling stage that decomposes raw data into sensitive and non-sensitive latent representations. Second, we design a local sampling stage that synthetically generates sensitive information with differential privacy and merges it with the non-sensitive latent features to create a useful representation while preserving privacy. The resulting latent representation is a task-agnostic representation of the original dataset with anonymized sensitive information. Most algorithms sanitize data in a task-dependent manner, and the few task-agnostic sanitization techniques do so by censoring sensitive information. In this work, we show that a better privacy-utility trade-off is achieved if sensitive information is instead synthesized privately. We validate the effectiveness of sanitizer by outperforming state-of-the-art baselines on existing benchmark tasks and by demonstrating tasks that are not possible with existing techniques.
Cryptography and Security, Computer Vision and Pattern Recognition, Computers and Society, Machine Learning
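
The two-stage process described in the abstract (decouple, then privately sample and merge) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes the latent vector is already available and that its first `n_sensitive` coordinates are the sensitive part, whereas the paper learns this decomposition. The sampling stage is replaced here by a simple stand-in: a mean released with the Laplace mechanism, followed by synthetic draws around it. The names `decouple`, `private_mean`, and `sanitize`, and the parameters `epsilon` and `clip`, are all hypothetical.

```python
import random


def decouple(latent, n_sensitive):
    """Split a latent vector into (sensitive, non_sensitive) parts.

    Assumption: a fixed index split stands in for the learned decoupling stage.
    """
    return latent[:n_sensitive], latent[n_sensitive:]


def private_mean(rows, epsilon, clip, rng):
    """Release the per-dimension mean of `rows` with the Laplace mechanism."""
    n, d = len(rows), len(rows[0])
    # Replacing one record moves each clipped mean by at most 2*clip/n,
    # so the L1 sensitivity over d dimensions is d * 2 * clip / n.
    scale = d * 2.0 * clip / (n * epsilon)
    rate = 1.0 / scale
    means = []
    for j in range(d):
        clipped = [max(-clip, min(clip, row[j])) for row in rows]
        # The difference of two i.i.d. exponentials is Laplace(0, scale).
        noise = rng.expovariate(rate) - rng.expovariate(rate)
        means.append(sum(clipped) / n + noise)
    return means


def sanitize(latents, n_sensitive, epsilon=1.0, clip=1.0, seed=0):
    """Replace each sensitive part with a synthetic draw, then merge."""
    rng = random.Random(seed)
    parts = [decouple(z, n_sensitive) for z in latents]
    noisy_mean = private_mean([s for s, _ in parts], epsilon, clip, rng)
    released = []
    for _, non_sensitive in parts:
        # Post-processing of the private mean: no raw sensitive value is read.
        synthetic = [m + rng.gauss(0.0, 0.1) for m in noisy_mean]
        released.append(synthetic + list(non_sensitive))
    return released
```

In this sketch the differential-privacy guarantee covers only the released mean of the sensitive coordinates. The non-sensitive coordinates pass through unchanged, so any real privacy claim would depend on how well the decoupling stage separates the two parts.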