Privacy-Preserving Microbiome Analysis Using Secure Computation

Justin Wagner, Joseph N. Paulson, Xiao-Shun Wang, Bobby Bhattacharjee, Hector Corrada Bravo
DOI: https://doi.org/10.1101/025999
2015-09-02
Abstract: Motivation: Developing targeted therapeutics and identifying biomarkers rely on large amounts of patient data. Beyond human DNA, researchers now investigate the DNA of micro-organisms inhabiting the human body. An individual's collection of microbial DNA consistently identifies that person and could be used to link a real-world identity to a sensitive attribute in a research dataset. Unfortunately, the current suite of DNA-specific privacy-preserving analysis tools does not meet the requirements of microbiome sequencing studies. Results: We augment an existing categorization of genomic-privacy attacks to incorporate microbiome sequencing and provide an implementation of metagenomic analyses using secure computation. Our implementation allows researchers to perform analysis over combined data without revealing individual patient attributes. We implement three metagenomic analyses and evaluate them on real datasets for comparative analysis. We use our implementation to simulate sharing data between four policy domains and measure the increase in significant discoveries. Additionally, we describe an application of our implementation that forms pools of patient data which drug companies can query, with patients compensated for the analysis. Availability: The software is freely available for download at: http://cbcb.umd.edu/~hcorrada/projects/secureseq.html