CRISPR-M: Predicting sgRNA off-target effect using a multi-view deep learning network
Jialiang Sun,Jun Guo,Jian Liu
DOI: https://doi.org/10.1371/journal.pcbi.1011972
2024-03-15
PLoS Computational Biology
Abstract: Using the CRISPR-Cas9 system to perform base substitutions at the target site is a typical technique for genome editing with the potential for applications in gene therapy and agricultural productivity. When the CRISPR-Cas9 system uses guide RNA to direct the Cas9 endonuclease to the target site, it may misdirect it to a potential off-target site, resulting in unintended genome editing. Although several computational methods have been proposed to predict off-target effects, their prediction capability still leaves room for improvement. In this paper, we present an effective approach called CRISPR-M with a new encoding scheme and a novel multi-view deep learning model to predict the sgRNA off-target effects for target sites containing indels and mismatches. CRISPR-M takes advantage of convolutional neural networks and bidirectional long short-term memory recurrent neural networks to construct a three-branch, multi-view network. Compared with existing methods, CRISPR-M demonstrates significant performance advantages on real-world datasets. Furthermore, experimental analysis of CRISPR-M under multiple metrics reveals its capability to extract features and validates its superiority on sgRNA off-target effect predictions.
Genome editing using the CRISPR-Cas9 system, particularly base substitutions directed by guide RNA, holds immense potential for applications in gene therapy and agricultural productivity. However, the risk of unintended off-target effects poses a challenge, as misdirection of the Cas9 endonuclease can lead to unintended genome alterations. While computational methods exist for predicting off-target effects, there remains a need for encoding methods with a larger representation space and deep learning models with better generalization capability and adaptability.
This paper introduces CRISPR-M, an innovative approach addressing the limitations of existing methods in predicting off-target effects, especially for target sites with indels and mismatches. CRISPR-M employs a novel encoding scheme and a multi-view deep learning model, combining convolutional neural networks and bidirectional long short-term memory recurrent neural networks. The three-branch network structure enhances the prediction accuracy by considering multiple perspectives. Compared with previous representative methods, CRISPR-M exhibits remarkable performance advantages when applied to real-world datasets. The experimental evaluation of CRISPR-M, assessed by various metrics such as ROC, PRC, GC content and melting temperature, demonstrates its ability to extract meaningful features and establishes its superiority in predicting off-target effects of sgRNA.
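The summary names GC content and melting temperature among the quantities used in the evaluation but does not say how they are computed. As a minimal sketch, assuming the simple Wallace rule for melting temperature (an assumption made here for illustration, not a formula taken from the paper), the two quantities could be computed for a guide sequence as follows. The function names and the example sequence are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G/C bases in a DNA or RNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(seq: str) -> float:
    """Rough melting temperature in degrees Celsius.

    Uses the Wallace rule, Tm = 2*(A+T) + 4*(G+C), which is only a
    coarse approximation for short oligonucleotides. This choice is an
    assumption of the sketch, not the method stated in the source.
    """
    seq = seq.upper().replace("U", "T")  # treat RNA input like DNA
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2.0 * at + 4.0 * gc


# Hypothetical 20-nt guide sequence, used only to exercise the functions.
guide = "GAGTCCGAGCAGAAGAAGAA"
print(gc_content(guide))  # 0.5
print(wallace_tm(guide))  # 60.0
```

Grouping predictions by such sequence properties is one way to check whether a model's performance depends on guide composition; more accurate nearest-neighbor melting temperature models exist and may be preferable in practice.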
biochemical research methods,mathematical & computational biology