Deep learning models to predict the editing efficiencies and outcomes of diverse base editors

Nahye Kim, Sungchul Choi, Sungjae Kim, Myungjae Song, Jung Hwa Seo, Seonwoo Min, Jinman Park, Sung-Rae Cho, Hyongbum Henry Kim
DOI: https://doi.org/10.1038/s41587-023-01792-x
Abstract: Applications of base editing are frequently restricted by the requirement for a protospacer adjacent motif (PAM), and selecting the optimal base editor (BE) and single-guide RNA (sgRNA) pair for a given target can be difficult. To select BEs and sgRNAs without extensive experimental work, we systematically compared the editing windows, outcomes and preferred motifs for seven BEs, including two cytosine BEs, two adenine BEs and three C•G to G•C BEs, at thousands of target sequences. We also evaluated nine Cas9 variants that recognize different PAM sequences and developed a deep learning model, DeepCas9variants, for predicting which variants function most efficiently at sites with a given target sequence. We then developed a computational model, DeepBE, that predicts editing efficiencies and outcomes of 63 BEs that were generated by incorporating nine Cas9 variants as nickase domains into the seven BE variants. The predicted median efficiencies of BEs with DeepBE-based design were 2.9- to 20-fold higher than those of rationally designed SpCas9-containing BEs.