Beacon Reconstruction Attack: Reconstruction of genomes in genomic data-sharing beacons using summary statistics

Kousar Saleem, A. Ercument Cicek, Sinem Sav
DOI: https://doi.org/10.1101/2024.12.10.627379
2024-12-11
Abstract: The genomic data-sharing beacon protocol, developed by the Global Alliance for Genomics and Health (GA4GH), offers a privacy-preserving mechanism for querying genomic datasets while restricting direct data access. Despite this design, beacons remain vulnerable to privacy attacks. This study introduces a novel privacy vulnerability of the protocol: one can reconstruct large portions of the genomes of all beacon participants using only the summary statistics reported by the protocol. We introduce a novel optimization-based algorithm that leverages beacon responses and single nucleotide polymorphism (SNP) correlations for reconstruction. By optimizing for SNP correlations and allele frequencies, the proposed approach achieves genome reconstruction with a substantially higher F1-score (70%) than baseline methods (45%) on beacons generated using individuals from the HapMap and OpenSNP datasets. Our findings reveal critical vulnerabilities in the beacon protocol, underscoring the need for enhanced privacy-preserving mechanisms to protect genomic data. Our implementation is available at https://github.com/ASAP-Bilkent/Beacon-Reconstruction-Attack
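To make the setting in the abstract concrete, the following is a minimal toy sketch, not the authors' algorithm. It assumes genomes are encoded as binary vectors (1 = minor allele present at a SNP), that the beacon answers one yes/no query per SNP, and that allele frequencies are the summary statistic available to the attacker. The function names and the frequency-threshold guess are hypothetical illustrations; the paper's baselines and its correlation-based optimization are not reproduced here.

```python
def beacon_responses(genomes):
    """Toy beacon: for each SNP, answer whether any participant carries the minor allele."""
    return [any(column) for column in zip(*genomes)]


def allele_frequencies(genomes):
    """Fraction of participants carrying the minor allele at each SNP (assumed summary statistic)."""
    n = len(genomes)
    return [sum(column) / n for column in zip(*genomes)]


def f1_score(truth, guess):
    """F1-score of a reconstructed binary genome matrix against the true one."""
    pairs = [(t, g) for t_row, g_row in zip(truth, guess) for t, g in zip(t_row, g_row)]
    tp = sum(1 for t, g in pairs if t == 1 and g == 1)
    fp = sum(1 for t, g in pairs if t == 0 and g == 1)
    fn = sum(1 for t, g in pairs if t == 1 and g == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Three hypothetical participants, four SNPs.
truth = [
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]

responses = beacon_responses(truth)
freqs = allele_frequencies(truth)

# Hypothetical naive guess: give every participant the minor allele
# wherever its frequency is at least 0.5. SNP correlations are ignored.
row_guess = [1 if f >= 0.5 else 0 for f in freqs]
guess = [list(row_guess) for _ in truth]

score = f1_score(truth, guess)
```

A guess of this kind assigns the same genotype to every participant, which is why per-SNP statistics alone leave room for improvement; the abstract attributes the reported gain to additionally exploiting SNP correlations.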
Bioinformatics