Anastasia S. Lyulina, Zhiru Liu, Benjamin H. Good
Abstract: Recombination breaks down genetic linkage by reshuffling existing variants onto new genetic backgrounds. These dynamics are traditionally quantified by examining the correlations between alleles, and how they decay as a function of the recombination rate. However, the magnitudes of these correlations are strongly influenced by other evolutionary forces like natural selection and genetic drift, making it difficult to tease out the effects of recombination. Here we introduce a theoretical framework for analyzing an alternative family of statistics that measure the homoplasy produced by recombination. We derive analytical expressions that predict how these statistics depend on the rates of recombination and recurrent mutation, the strength of negative selection and genetic drift, and the present-day frequencies of the mutant alleles. We find that the degree of homoplasy can strongly depend on this frequency scale, which reflects the underlying timescales over which these mutations occurred. We show how these scaling properties can be used to isolate the effects of recombination, and discuss their implications for the rates of horizontal gene transfer in bacteria.
What problem does this paper attempt to address?
The paper aims to quantify how recombination shapes linkage equilibrium between rare mutations in the genome. Specifically, the authors introduce a theoretical framework for analyzing statistics that measure the homoplasy generated by recombination, and derive how these statistics depend on the recombination rate, the recurrent mutation rate, the strength of negative selection and genetic drift, and the present-day frequencies of the mutant alleles. The goal is to isolate the effects of recombination and thereby better understand processes such as horizontal gene transfer in bacteria.
### Background and Motivation of the Paper
Linkage disequilibrium (LD) is a central concept in population genetics that describes the statistical associations between alleles at different loci. These associations carry information about the evolutionary forces acting within a population, such as recombination, natural selection, and genetic drift. Because these forces act jointly, however, it is difficult to isolate the specific role of recombination. Most existing studies focus on pairwise correlations between alleles at different positions in the genome, but the theoretical understanding of these correlations is still limited.
### Research Methods and Innovation Points
1. **Introduction of New Statistics**: The authors propose a statistic \(\Lambda\) that measures the homoplasy generated by recombination. It is defined as:
\[
\Lambda \equiv \frac{f_{ab} f_{Ab} f_{aB} f_{AB}}{f_A^2 (1 - f_A)^2 f_B^2 (1 - f_B)^2}
\]
where \(f_{ab}\), \(f_{Ab}\), \(f_{aB}\), and \(f_{AB}\) are the frequencies of the four two-locus haplotypes, and \(f_A\) and \(f_B\) are the marginal allele frequencies at the two loci.
2. **Theoretical Framework**: The authors use a two-locus Wright-Fisher model to describe the combined effects of mutation, recombination, negative selection, and genetic drift. Using a weighted-moments approach, they derive analytical expressions for \(\Lambda\) that depend on the recombination rate, the recurrent mutation rate, the strength of negative selection, and the frequencies of the mutant alleles.
3. **Conditional Averages**: The authors consider two weighting functions: one that focuses on pairs of loci where both alleles are rare, and another that focuses on pairs where one allele is rare and the other is at an intermediate frequency. These weighting functions make it possible to analyze the homoplasy dynamics at different frequency scales.
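To make the definition of \(\Lambda\) concrete, the sketch below evaluates the statistic for a single pair of loci and follows it under recombination alone. It is not the authors' weighted-moment calculation. It assumes an infinite population with no drift, mutation, or selection, and uses the textbook deterministic update in which the linkage disequilibrium coefficient \(D = f_{AB} f_{ab} - f_{Ab} f_{aB}\) shrinks by a factor \(1 - r\) each generation. The function names (`lambda_statistic`, `recombine`, `lambda_trajectory`), the per-generation recombination rate `r`, and the starting frequencies are illustrative assumptions, not taken from the paper.

```python
def lambda_statistic(f_ab, f_Ab, f_aB, f_AB):
    """Evaluate Lambda for one pair of loci from the four haplotype frequencies."""
    # Marginal frequencies of allele A (first locus) and allele B (second locus).
    f_A = f_Ab + f_AB
    f_B = f_aB + f_AB
    numerator = f_ab * f_Ab * f_aB * f_AB
    denominator = (f_A * (1.0 - f_A) * f_B * (1.0 - f_B)) ** 2
    return numerator / denominator


def recombine(f_ab, f_Ab, f_aB, f_AB, r):
    """One generation of recombination in an infinite population (assumption:
    no drift, mutation, or selection). Marginal frequencies are unchanged."""
    d = f_AB * f_ab - f_Ab * f_aB  # linkage disequilibrium coefficient D
    return (f_ab - r * d, f_Ab + r * d, f_aB + r * d, f_AB - r * d)


def lambda_trajectory(freqs, r, generations):
    """Return Lambda at generation 0, 1, ..., `generations`."""
    values = [lambda_statistic(*freqs)]
    for _ in range(generations):
        freqs = recombine(*freqs, r)
        values.append(lambda_statistic(*freqs))
    return values


# Hypothetical starting point: the double mutant AB is absent, so only three
# of the four haplotypes exist and Lambda starts at zero.
start = (0.7, 0.1, 0.2, 0.0)  # (f_ab, f_Ab, f_aB, f_AB)
trajectory = lambda_trajectory(start, r=0.01, generations=2000)
```

In this toy setting \(\Lambda\) is zero while the double-mutant haplotype is absent and approaches 1 as the loci reach linkage equilibrium, where each haplotype frequency equals the product of its marginal frequencies. The finite-population results summarized in this document additionally account for drift, selection, and recurrent mutation, which this sketch leaves out.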
### Main Findings
1. **Neutral Alleles**: In the absence of selection, \(\bar{\Lambda}_2\) depends only on the population-scaled recombination rate \(NR\) and the frequency scale \(f_0\). When \(NRf_0 \ll 1\), \(\bar{\Lambda}_2\approx 2NRf_0\); when \(NRf_0 \gg 1\), \(\bar{\Lambda}_2\approx 1\).
2. **The Influence of Negative Selection**: Under negative selection, the behavior of \(\bar{\Lambda}_2\) depends on the relative strength of selection and genetic drift. When \(Nsf_0 \ll 1\), selection has little effect on allele frequencies, and \(\bar{\Lambda}_2\) stays close to the neutral result; when \(Nsf_0 \gg 1\), \(\bar{\Lambda}_2\) can be described by an effective selection cost \(s_e\):
\[
\bar{\Lambda}_2 \approx \begin{cases}
\frac{R}{s_e} & \text{if } R \ll s_e \\
1 & \text{if } R \gg s_e
\end{cases}
\]
where \(s_e=\frac{60}{19}s\).
3. **The Influence of Recurrent Mutation**: When recurrent mutation is included, the authors found \(\bar{\Lamb