A Generalizable Deep Learning Framework for Inferring Fine-Scale Germline Mutation Rate Maps

Yiyuan Fang, Shuyi Deng, Cai Li
DOI: https://doi.org/10.1101/2021.10.25.465689
2021-01-01
Abstract: Germline mutation rates are essential for genetic and evolutionary analyses. Yet estimating accurate fine-scale mutation rates across the genome is a great challenge, due to relatively few observed mutations and intricate relationships between predictors and mutation rates. Here, we present Mutation Rate Learner (MuRaL), a deep learning framework to predict mutation rates at the nucleotide level using only genomic sequences as input. Harnessing human germline variants for comprehensive assessment, we show that MuRaL achieves better predictive performance than current state-of-the-art methods. Moreover, MuRaL can build models with relatively few training mutations and a moderate number of sequenced individuals, and can leverage transfer learning to further reduce data and time demands. We apply MuRaL to produce genome-wide mutation rate maps for four representative species (Homo sapiens, Macaca mulatta, Drosophila melanogaster and Arabidopsis thaliana), demonstrating its high applicability. As an example, we use improved mutation rate estimates to stratify human genes into distinct groups that are enriched for different functions, and highlight that many developmental genes are subject to high mutational burden. The open-source software and generated mutation rate maps can greatly facilitate related research.