GWASBrewer: An R Package for Simulating Realistic GWAS Summary Statistics

Jean Morrison
DOI: https://doi.org/10.1101/2024.04.16.589571
2024-04-20
Abstract: Many statistical genetics analysis methods make use of GWAS summary statistics. Best statistical practice requires evaluating these methods in simulations against a known truth. Ideally, these simulations should be as realistic as possible. However, simulating summary statistics by first simulating individual genotype and phenotype data is extremely computationally demanding, especially when large sample sizes or many traits are required. We present GWASBrewer, an open-source R package for direct simulation of GWAS summary statistics. We show that statistics simulated by GWASBrewer have the same distribution as statistics generated from individual-level data, and can be produced at a fraction of the computational expense. Additionally, GWASBrewer can simulate standard error estimates, something that is typically not done when sampling summary statistics directly. GWASBrewer is highly flexible, allowing the user to simulate data for multiple traits connected by causal effects and with complex distributions of effect sizes. We demonstrate example uses of GWASBrewer for evaluating Mendelian randomization, polygenic risk score, and heritability estimation methods.
Genetics