Pan-UK Biobank GWAS improves discovery, analysis of genetic architecture, and resolution into ancestry-enriched effects

Konrad J Karczewski, Rahul Gupta, Masahiro Kanai, Wenhan Lu, Kristin Tsuo, Ying Wang, Raymond K Walters, Patrick Turley, Shawneequa Callier, Nirav Shah, Nikolas Baya, Duncan S Palmer, Jacqueline I Goldstein, Gopal Sarma, Matthew Solomonson, Nathan Cheng, Sam Bryant, Claire Churchhouse, Caroline M Cusick, Timothy Poterba, John Compitello, Daniel King, Wei Zhou, Cotton Seed, Hilary K Finucane, Mark J Daly, Benjamin M Neale, Elizabeth G Atkinson, Alicia R Martin
DOI: https://doi.org/10.1101/2024.03.13.24303864
2024-10-01
Abstract: Large biobanks, such as the UK Biobank (UKB), enable massive phenome by genome-wide association studies that elucidate genetic etiology of complex traits. However, individuals from diverse genetic ancestry groups are often excluded from association analyses due to concerns about population structure introducing false positive associations. Here, we generate mixed model associations and meta-analyses across genetic ancestry groups, inclusive of a larger fraction of the UKB than previous efforts, to produce freely-available summary statistics for 7,266 traits. We build a quality control and analysis framework informed by genetic architecture. Overall, we identify 14,676 significant loci (p < 5 × 10^-8) in the meta-analysis that were not found in the EUR genetic ancestry group alone, including novel associations for example between CAMK2D and triglycerides. We also highlight associations from ancestry-enriched variation, including a known pleiotropic missense variant in G6PD associated with several biomarker traits. We release these results publicly alongside FAQs that describe caveats for interpretation of results, enhancing available resources for interpretation of risk variants across diverse populations.
Genetic and Genomic Medicine
What problem does this paper attempt to address?
The paper aims to address the following key issues:

1. **Enhancing the discovery power of genome-wide association studies (GWAS)**: By running mixed-model association analyses within genetic ancestry groups and then meta-analyzing across them, the study identifies more loci associated with complex traits, including loci that were not found in the EUR genetic ancestry group alone.
2. **Deciphering genetic architecture**: Analyzing multiple ancestry groups gives a clearer picture of the genetic architecture of complex traits and highlights ancestry-enriched variation, such as a known pleiotropic missense variant in G6PD associated with several biomarker traits.
3. **Reducing false positives caused by population structure**: A quality control and analysis framework informed by genetic architecture limits the false positive associations that population structure (systematic genetic differences between groups) can introduce.
4. **Including more individuals of non-European genetic ancestry**: Previous GWAS often excluded individuals from diverse genetic ancestry groups, which limited discovery in those groups. This study includes a larger fraction of the UKB than previous efforts, improving the generalizability of the findings.
5. **Providing public data resources**: The study releases freely available summary statistics together with FAQs that describe caveats for interpretation, supporting the interpretation of risk variants across diverse populations.

Specifically, the study analyzed 7,266 traits in the UK Biobank using mixed-model association analyses and meta-analyses across genetic ancestry groups, and identified 14,676 significant loci (p < 5 × 10^-8) in the meta-analysis that were not found in the EUR genetic ancestry group alone. These findings improve understanding of the genetic basis of complex traits and provide a resource for future research.
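To make concrete how a locus can reach significance in a meta-analysis without reaching it in one ancestry group alone, here is a minimal sketch. It assumes a fixed-effect inverse-variance-weighted estimator, a common choice that the text above does not confirm. The function name `ivw_meta` and all effect sizes and standard errors are hypothetical illustrations, not values from the study.

```python
import math

GENOME_WIDE_P = 5e-8  # significance threshold quoted in the abstract


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis for one variant.

    betas, ses: per-group effect estimates and their standard errors.
    Returns (pooled beta, pooled standard error, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]  # precision of each estimate
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


# Hypothetical per-group estimates for a single variant (made-up numbers)
eur_beta, eur_se = 0.05, 0.01      # first group alone
other_beta, other_se = 0.06, 0.02  # a second, smaller group

_, _, p_eur = ivw_meta([eur_beta], [eur_se])
meta_beta, meta_se, p_meta = ivw_meta([eur_beta, other_beta],
                                      [eur_se, other_se])

eur_significant = p_eur < GENOME_WIDE_P
meta_significant = p_meta < GENOME_WIDE_P
print(f"single group: p={p_eur:.2e}, significant={eur_significant}")
print(f"meta-analysis: p={p_meta:.2e}, significant={meta_significant}")
```

With these made-up inputs the single-group test falls short of the threshold while the pooled estimate passes it, because the second group adds precision to a consistent effect. Real pipelines also have to handle heterogeneity between groups and per-group quality control, which this sketch ignores.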