A machine learning enhanced EMS mutagenesis probability map for efficient identification of causal mutations in Caenorhabditis elegans

Zhengyang Guo, Shimin Wang, Yang Wang, Zi Wang, Guangshuo Ou
DOI: https://doi.org/10.1371/journal.pgen.1011377
IF: 4.5
2024-08-28
PLoS Genetics
Abstract: Chemical mutagenesis-driven forward genetic screens are pivotal in unveiling gene functions, yet identifying causal mutations behind phenotypes remains laborious, hindering their high-throughput application. Here, we reveal a non-uniform mutation rate caused by Ethyl Methane Sulfonate (EMS) mutagenesis in the C. elegans genome, indicating that mutation frequency is influenced by proximate sequence context and chromatin status. Leveraging these factors, we developed a machine learning enhanced pipeline to create a comprehensive EMS mutagenesis probability map for the C. elegans genome. This map operates on the principle that causative mutations are enriched in genetic screens targeting specific phenotypes among random mutations. Applying this map to Whole Genome Sequencing (WGS) data of genetic suppressors that rescue a C. elegans ciliary kinesin mutant, we successfully pinpointed causal mutations without generating recombinant inbred lines. This method can be adapted in other species, offering a scalable approach for identifying causal genes and revitalizing the effectiveness of forward genetic screens.
Exploring gene functions through chemical mutagenesis-driven genetic screens is pivotal, yet the cumbersome task of identifying causative mutations remains a bottleneck, limiting their high-throughput potential. In this investigation, we uncovered a non-uniform mutation pattern induced by Ethyl Methane Sulfonate (EMS) mutagenesis in the C. elegans genome, highlighting the influence of proximate sequence context and chromatin status on mutation frequency. Leveraging these insights, we engineered a machine learning enhanced pipeline to construct a comprehensive EMS mutagenesis probability map for the C. elegans genome. This map operates on the principle that causative mutations are selectively enriched in genetic screens targeting specific phenotypes amid the backdrop of random mutations. Applying this mapping tool to Whole Genome Sequencing (WGS) data derived from genetic suppressors rescuing a C. elegans ciliary kinesin mutant, we achieved precise identification of causal mutations without resorting to the conventional generation of recombinant inbred lines. Our work not only advances understanding of mutation dynamics but also revitalizes the efficacy of forward genetic screens, contributing to the refinement of genetic exploration methodologies with implications for various organisms.
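The enrichment principle stated in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: it assumes a simple binomial model in which each gene has a background probability of being hit by a random EMS mutation in one strain (as a probability map might supply), and it asks how surprising the observed number of independent suppressor strains carrying a mutation in that gene would be. The function names, gene names, and all numbers are hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def rank_candidates(observed_hits, background_prob, n_strains):
    """Rank genes by how unlikely their hit counts are under the background.

    observed_hits:   gene -> number of independent strains with a mutation in it
    background_prob: gene -> assumed per-strain chance of a random hit
                     (hypothetical stand-in for a value from a probability map)
    n_strains:       number of sequenced suppressor strains
    """
    scores = {
        gene: binomial_tail(k, n_strains, background_prob[gene])
        for gene, k in observed_hits.items()
    }
    # The smallest tail probability means the strongest enrichment.
    return sorted(scores.items(), key=lambda item: item[1])


# Illustrative numbers only: geneA is hit often despite a low background
# rate, while geneB is hit once and has a high background rate.
hits = {"geneA": 5, "geneB": 1}
background = {"geneA": 0.01, "geneB": 0.20}
ranking = rank_candidates(hits, background, n_strains=10)
```

Under this assumed model, a gene that is mutated in many independent strains despite a low background probability sorts to the top of the list, which is the intuition behind treating enriched genes as candidate causal loci.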
genetics & heredity