ge-CRISPR - An integrated pipeline for the prediction and analysis of sgRNAs genome editing efficiency for CRISPR/Cas system

Karambir Kaur, Amit Kumar Gupta, Akanksha Rajput, Manoj Kumar
DOI: https://doi.org/10.1038/srep30870
IF: 4.6
2016-09-01
Scientific Reports
Abstract: Genome editing with sgRNA, a component of the CRISPR/Cas system, has emerged as a preferred genome editing technology in recent years. However, the activity and stability of an sgRNA in genome targeting are greatly influenced by its sequence features. A few prediction tools have been developed to design effective sgRNAs, but these methods have their own limitations. We therefore developed “ge-CRISPR”, which uses high-throughput data to predict and analyze the genome editing efficiency of sgRNAs. Predictive models were built with support vector machines (SVM) for pipeline-1 (classification) and pipeline-2 (regression) using 2090 and 4139 experimentally verified sgRNAs, respectively, from Homo sapiens, Mus musculus, Danio rerio and Xenopus tropicalis. During 10-fold cross-validation, pipeline-1 achieved an accuracy of 87.70% and a Matthews correlation coefficient of 0.75 on the training dataset (T1840), and it performed equally well on the independent dataset (V250). In pipeline-2, the best models attained Pearson correlation coefficients of 0.68 and 0.69 on the training (T3619) and independent (V520) datasets, respectively. For a given genomic region, ge-CRISPR (http://bioinfo.imtech.res.in/manojk/gecrispr/) identifies potent sgRNAs, their qualitative and quantitative efficiencies, and their potential off-targets. It will be useful to the scientific community engaged in CRISPR research and therapeutics development.
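The evaluation metrics named in the abstract (accuracy and Matthews correlation coefficient for the classification pipeline, Pearson correlation coefficient for the regression pipeline) can be illustrated with a minimal sketch. This is not the ge-CRISPR implementation: the function names `accuracy_and_mcc` and `pearson` are hypothetical, and the toy labels and scores are invented purely to show how the formulas are applied.

```python
import math


def accuracy_and_mcc(y_true, y_pred):
    """Return (accuracy, MCC) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


def pearson(x, y):
    """Return the Pearson correlation coefficient between two sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Toy data only (assumed for illustration, not taken from the paper).
    observed_class = [1, 1, 0, 0]      # 1 = high-efficiency sgRNA
    predicted_class = [1, 0, 0, 0]
    print(accuracy_and_mcc(observed_class, predicted_class))

    observed_eff = [0.10, 0.40, 0.35, 0.80]
    predicted_eff = [0.20, 0.30, 0.50, 0.70]
    print(pearson(observed_eff, predicted_eff))
```

In a 10-fold cross-validation setting, such metrics would typically be computed on the held-out fold of each split and then averaged or pooled; the abstract does not state which of these the authors used.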
multidisciplinary sciences