popDMS infers mutation effects from deep mutational scanning data

Zhenchen Hong, John P. Barton
DOI: https://doi.org/10.1101/2024.01.29.577759
2024-01-31
Abstract: Deep mutational scanning (DMS) experiments provide a powerful method to measure the functional effects of genetic mutations at massive scales. However, the data generated from these experiments can be difficult to analyze, with significant variation between experimental replicates. To overcome this challenge, we developed popDMS, a computational method based on population genetics theory, to infer the functional effects of mutations from DMS data. Through extensive tests, we found that the functional effects of single mutations and epistasis inferred by popDMS are highly consistent across replicates, comparing favorably with existing methods. Our approach is flexible and can be widely applied to DMS data that includes multiple time points, multiple replicates, and different experimental conditions.
Bioinformatics
What problem does this paper attempt to address?
This paper addresses the difficulty of analyzing data from deep mutational scanning (DMS) experiments. DMS experiments measure the functional impact of genetic mutations at large scale, but the resulting data are hard to analyze, in particular because results vary substantially between experimental replicates.

To overcome this, the authors developed popDMS, a method based on population genetics theory for inferring the functional effects of mutations. popDMS treats each round of selection in the experiment as a round of reproduction in a natural population, and quantifies the survival advantage or disadvantage of each mutation with a selection coefficient. It uses Bayesian inference to find the selection coefficients that best explain the data, and it can handle data from multiple time points, multiple replicates, and different experimental conditions.

The paper compares popDMS with existing methods. The effects it infers for individual mutations and for epistatic interactions are more consistent between replicate experiments, and it is better at handling noise and at estimating the uncertainty of variant effects. Across 25 DMS datasets, popDMS improves the correlation of variant effects between replicates, with an average increase in R² of 0.36. The inferred effects also agree well with amino acid variant frequencies observed in natural viral populations. Overall, popDMS is an effective and reliable tool for inferring mutation effects from DMS data, and its grounding in evolutionary theory gives it an advantage when analyzing complex data.
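To make the idea of a selection coefficient concrete, here is a minimal Python sketch. It is a deliberately simplified stand-in, not the popDMS estimator. It assumes a deterministic model in which a variant with selection coefficient s grows by a factor of exp(s) per round relative to a reference sequence, so the log ratio of variant to reference counts changes by s each round. A zero-mean Gaussian prior on s, with hypothetical strength `gamma`, shrinks poorly supported estimates toward zero, and replicates are pooled into one estimate. The function name, its parameters, and the pseudocount are all illustrative assumptions.

```python
import math


def infer_selection(replicates, gamma=1.0, pseudocount=0.5):
    """Estimate one variant's selection coefficient from count data.

    Simplified illustration (NOT the popDMS estimator): fits a line through
    the change in log(variant/reference) over time, with a ridge penalty
    `gamma` that plays the role of a zero-mean Gaussian prior on s.

    replicates: list of (variant_counts, reference_counts, times) tuples,
                one tuple per experimental replicate.
    """
    numerator = 0.0
    denominator = gamma  # prior strength; larger values shrink s toward 0
    for variant_counts, reference_counts, times in replicates:
        if not (len(variant_counts) == len(reference_counts) == len(times)):
            raise ValueError("counts and times must have equal length")
        # Log ratio of variant to reference counts at each time point.
        # The pseudocount avoids log(0) for variants that drop out.
        log_ratios = [
            math.log((v + pseudocount) / (r + pseudocount))
            for v, r in zip(variant_counts, reference_counts)
        ]
        t0, y0 = times[0], log_ratios[0]
        # Accumulate least-squares terms for a line through the first point.
        for t, y in zip(times, log_ratios):
            numerator += (t - t0) * (y - y0)
            denominator += (t - t0) ** 2
    if denominator == 0:
        raise ValueError("need at least two distinct time points or gamma > 0")
    return numerator / denominator


# Toy example: the variant doubles relative to the reference every round.
replicate = ([100, 200, 400], [1000, 1000, 1000], [0, 1, 2])
s_unregularized = infer_selection([replicate], gamma=0.0, pseudocount=0.0)
s_regularized = infer_selection([replicate], gamma=5.0, pseudocount=0.0)
```

In the toy example the unregularized estimate equals ln 2, the per-round log growth advantage, and the regularized estimate is pulled toward zero. Pooling replicates into a single estimate, rather than averaging separately inferred values, is one simple way to use data from several replicates at once.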