Phenotype projections accelerate biobank-scale GWAS

Michael Zietz, Undina Gisladottir, Kathleen LaRow Brown, Nicholas P. Tatonetti
DOI: https://doi.org/10.1101/2023.11.20.567948
2024-04-26
Abstract: Understanding the genetic basis of complex disease is a critical research goal due to the immense, worldwide burden of these diseases. Pan-biobank genome-wide association studies (GWAS) provide a powerful resource in complex disease genetics, generating shareable summary statistics on thousands of phenotypes. Biobank-scale GWAS have two notable limitations: they are resource-intensive to compute, and they provide no information about hand-crafted phenotype definitions, which are often more relevant to study. Here we present Indirect GWAS, a summary-statistic-based method that addresses these limitations. Indirect GWAS computes GWAS statistics for any phenotype defined as a linear combination of other phenotypes. Our method can reduce runtime by an order of magnitude for large pan-biobank GWAS, and it enables ultra-rapid (roughly one minute) GWAS on hand-crafted phenotype definitions using only summary statistics. Overall, this method advances complex disease research by facilitating more accessible and cost-effective genetic studies using large observational data.
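The idea of computing statistics for a linear combination of phenotypes can be illustrated with a minimal sketch. It is not the authors' implementation. It only demonstrates the linearity of ordinary least squares: the effect size for a weighted sum of phenotypes equals the same weighted sum of the per-phenotype effect sizes. The simulated data, the weights, and the helper `ols_slope` are hypothetical, and standard errors, which would additionally require information about how the phenotypes covary, are omitted.

```python
import numpy as np


def ols_slope(genotype, phenotype):
    """Slope from a simple linear regression of one phenotype on one variant."""
    g = genotype - genotype.mean()
    return float(g @ (phenotype - phenotype.mean()) / (g @ g))


rng = np.random.default_rng(0)
n_samples = 1000

# Hypothetical genotype dosages (0, 1, or 2 copies of an allele).
genotype = rng.binomial(2, 0.3, size=n_samples).astype(float)

# Three simulated "feature" phenotypes, each with its own genetic effect.
true_effects = np.array([1.0, -0.5, 2.0])
features = rng.normal(size=(n_samples, 3)) + 0.1 * genotype[:, None] * true_effects

# Hypothetical hand-crafted phenotype: a linear combination of the features.
weights = np.array([0.5, 1.5, -1.0])

# Per-feature summary statistics (effect sizes only).
feature_betas = np.array([ols_slope(genotype, features[:, j]) for j in range(3)])

# Indirect estimate: combine the summary statistics with the same weights.
indirect_beta = float(weights @ feature_betas)

# Direct estimate: build the combined phenotype, then regress on genotype.
direct_beta = ols_slope(genotype, features @ weights)

print(indirect_beta, direct_beta)
```

The two printed values agree up to floating-point error, which is why effect sizes for a newly defined phenotype can be obtained from existing summary statistics without returning to individual-level data.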
Genetics