A fusion framework of deep learning and machine learning for predicting sgRNA cleavage efficiency
Yu Liu, Rui Fan, Jingkun Yi, Qinghua Cui, Chunmei Cui
DOI: https://doi.org/10.1016/j.compbiomed.2023.107476
IF: 7.7
2023-09-08
Computers in Biology and Medicine
Abstract: The CRISPR/Cas9 system is a powerful tool for genome editing. Numerous studies have shown that the choice of sgRNA strongly affects editing efficiency. However, it is still unclear which rules should be followed to design sgRNAs with high cleavage efficiency. Several machine learning and deep learning methods have been developed to predict sgRNA cleavage efficiency, but their prediction accuracy remains unsatisfactory. Here we propose a fusion framework of deep learning and machine learning. It first processes the primary sequence and secondary structure features of the sgRNAs using both a convolutional neural network (CNN) and a recurrent neural network (RNN), and then uses the features extracted by the deep neural network to train a conventional machine learning model, LightGBM (LGBM). The new approach outperforms previous methods. The Spearman's correlation coefficient between predicted and measured sgRNA cleavage efficiency for our model (0.917) is more than 5% higher than that of the most advanced previous method (0.865), and the mean squared error falls from 7.89 × 10⁻³ to 4.75 × 10⁻³. Finally, we developed an online tool, CRISep (http://www.cuilab.cn/CRISep), to assess the suitability of sgRNAs based on our models.
Engineering, Biomedical; Computer Science, Interdisciplinary Applications; Mathematical & Computational Biology; Biology
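
The abstract reports two evaluation metrics, Spearman's correlation coefficient and mean squared error, and compares them against the previous best method. The following is a minimal sketch of how such metrics and the relative changes between the reported values could be computed. It is not the authors' evaluation code; all function names are hypothetical, and the only figures taken from the abstract are 0.917, 0.865, 7.89 × 10⁻³ and 4.75 × 10⁻³.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(measured), average_ranks(predicted))


def mean_squared_error(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of squared differences between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def relative_change(old: float, new: float) -> float:
    """Fractional change from old to new (negative means a reduction)."""
    return (new - old) / old


# Relative changes implied by the figures reported in the abstract.
spearman_gain = relative_change(0.865, 0.917)    # roughly +0.06
mse_change = relative_change(7.89e-3, 4.75e-3)   # roughly -0.40
```

Under these definitions, the reported Spearman coefficients correspond to a relative gain of about 6%, consistent with the "more than 5%" stated in the abstract, and the reported mean squared errors correspond to a reduction of roughly 40%.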