ClOneHORT: Approaches for Improved Fidelity in Generative Models of Synthetic Genomes

Roland Laboulaye, Victor Borda, Shuo Chen, Kari E. North, Robert Kaplan, Timothy D. O'Connor
DOI: https://doi.org/10.1101/2024.06.25.600651
2024-06-29
Abstract: Deep generative models have the potential to overcome difficulties in sharing individual-level genomic data by producing synthetic genomes that preserve the genomic associations specific to a cohort while not violating the privacy of any individual cohort member. However, there is significant room for improvement in the fidelity and usability of existing synthetic genome approaches. We demonstrate that when combined with plentiful data and with population-specific selection criteria, deep generative models can produce synthetic genomes and cohorts that closely model the original populations. Our methods improve fidelity in the site-frequency spectra and linkage disequilibrium decay and yield synthetic genomes that can be substituted in downstream local ancestry inference analysis, recreating results with 0.91 to 0.94 accuracy.
Genomics