Inferring polygenic negative selection underlying an individual trait as a distribution of fitness effects (DFEs) from GWAS summary statistics

Alexander T. Xue, Yi-fei Huang, Adam Siepel
DOI: https://doi.org/10.1101/2024.07.29.601707
2024-08-02
Abstract: There has been rising interest in exploiting data from genome-wide association studies (GWAS) to detect a genetic signature of natural selection acting on a given phenotype. However, current approaches are unable to directly estimate the distribution of fitness effects (DFE), an established property in population genetics that can elucidate genomic architecture pertaining to a particular focal trait. To this end, we introduce ASSESS, an inferential method that exploits the Poisson Random Field (PRF) to model selection coefficients from genome-wide allele count data, while jointly conditioning GWAS summary statistics on a latent distribution of phenotypic effect sizes. This probabilistic model is unified under the assumption of an explicit relationship between fitness and trait effect to yield a DFE. To gauge the performance of ASSESS, we enlisted various simulation experiments that covered a range of usage cases and model misspecifications, which revealed accurate recovery of the underlying selection signal. As a further proof-of-concept, ASSESS was applied to an array of publicly available human trait data, whereby we replicated previously published empirical findings from an alternative methodology. These demonstrations illustrate the potential of ASSESS to satisfy an increasing need for powerful yet convenient population genomic inference from GWAS summary statistics.
Evolutionary Biology
What problem does this paper attempt to address?
This paper proposes a method for a problem that current GWAS-based approaches leave open: directly estimating the distribution of fitness effects (DFE) of natural selection acting on a specific complex trait, using summary statistics from genome-wide association studies (GWAS). Specifically, it addresses the following aspects:

1. **Direct Estimation of the DFE**: Existing methods typically detect signs of natural selection only indirectly, or estimate proxy attributes related to fitness, but cannot estimate the DFE itself. The authors developed ASSESS to fill this gap.
2. **Utilizing GWAS Summary Data**: ASSESS works from GWAS summary statistics, together with genome-wide allele count data, rather than individual-level genotype and phenotype data, which improves computational efficiency, convenience, and privacy protection.
3. **Addressing Polygenic Traits**: Many complex traits arise from the combined small effects of many loci. ASSESS handles this by integrating effect-size information across the entire genome.
4. **Validating the Method's Effectiveness**: The paper evaluates ASSESS through simulation experiments that cover a range of usage cases and model misspecifications, and applies it to publicly available human trait data, where it replicates previously published findings from an alternative methodology.

In summary, the core contribution is a method that infers the DFE associated with a specific complex trait directly from GWAS summary statistics, together with evidence of its accuracy and practicality.
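
The abstract says that ASSESS uses the Poisson Random Field (PRF) to model selection coefficients from allele count data. To give a feel for what the PRF predicts, the sketch below computes the expected site frequency spectrum for a single scaled selection coefficient, using the classical Sawyer and Hartl stationary density in one common parameterization. It is an illustration only, not the ASSESS implementation: the function name `prf_expected_sfs`, the parameters `gamma` (scaled selection coefficient, negative for deleterious alleles) and `theta` (scaled mutation rate), and the midpoint-rule integration are all assumptions made here, and the scaling conventions for `gamma` and `theta` differ between papers.

```python
import math


def prf_expected_sfs(n, gamma, theta=1.0, grid=2000):
    """Expected number of sites with derived-allele count i = 1..n-1.

    Assumed density of population frequency x (hypothetical parameterization):
        theta * (1 - exp(-2*gamma*(1 - x))) / ((1 - exp(-2*gamma)) * x * (1 - x))
    Each x is then sampled binomially into a sample of n chromosomes.
    """
    # Midpoint rule on (0, 1) avoids evaluating the integrand at 0 or 1.
    xs = [(k + 0.5) / grid for k in range(grid)]
    sfs = []
    for i in range(1, n):
        coeff = math.comb(n, i)
        total = 0.0
        for x in xs:
            if abs(gamma) < 1e-8:
                # Neutral limit of the selection term.
                weight = 1.0 - x
            else:
                weight = math.expm1(-2.0 * gamma * (1.0 - x)) / math.expm1(-2.0 * gamma)
            # The x*(1-x) in the density's denominator is cancelled
            # analytically against the binomial term x**i * (1-x)**(n-i).
            total += coeff * x ** (i - 1) * (1.0 - x) ** (n - i - 1) * weight
        sfs.append(theta * total / grid)
    return sfs


neutral = prf_expected_sfs(10, 0.0)       # close to theta / i for each count i
deleterious = prf_expected_sfs(10, -10.0)  # fewer sites overall, skewed to rare alleles
```

Under the PRF, the observed number of sites in each frequency class is treated as an independent Poisson draw with these means, which is what makes a likelihood over allele counts tractable. A DFE then corresponds to averaging such spectra over a distribution of `gamma` values rather than fixing one.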