Strong association between genomic 3D structure and CRISPR cleavage efficiency

Shaked Bergman, Tamir Tuller
DOI: https://doi.org/10.1371/journal.pcbi.1012214
2024-06-08
PLoS Computational Biology
Abstract: CRISPR is a gene editing technology that enables precise in vivo genome editing, but its potential is hampered by its relatively low specificity and sensitivity. Improving CRISPR's on-target and off-target effects requires a better understanding of its mechanism and determinants. Here we demonstrate, for the first time, the association of the chromosomal 3D spatial structure with CRISPR's cleavage efficiency, and its predictive capabilities. We used high-resolution Hi-C data to estimate the 3D distance between different regions in the human genome and utilized these spatial properties to generate 3D-based features characterizing each region's density. We evaluated these features based on empirical, in vivo CRISPR efficiency data and compared them to 425 features used in state-of-the-art models. The 3D features ranked in the top 13% of the features and significantly improved the predictive power of LASSO and XGBoost models trained with these features. The features indicated that sites with lower spatial density demonstrated higher efficiency. Understanding how CRISPR is affected by the 3D DNA structure provides insight into CRISPR's mechanism in general and improves our ability to correctly predict CRISPR's cleavage as well as design sgRNAs for therapeutic and scientific use.

Author summary: CRISPR is an efficient and precise genome editing technology, but our understanding of the factors influencing it, and our ability to accurately design and predict its efficiency at different target sites, are lacking. Current models mostly use the sequence of the target site, rather than incorporating additional information regarding its genomic location. Here we propose a new type of predictive feature, based on the 3D structure of the genome. We show that features estimating the spatial density of the target site correlate highly with CRISPR efficiency, compared to classic sequence-based features, and that adding these features to predictive models significantly improves their power. CRISPR efficiency was negatively correlated with 3D density, indicating that CRISPR is more efficient in sparse regions, possibly due to easier access to the target site. Improving our ability to predict CRISPR action will allow us to further understand its mechanism and better utilize it in research and medicine.
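
The idea of a 3D density feature can be illustrated with a minimal sketch. This is not the authors' pipeline: the inverse power-law conversion from Hi-C contact counts to distance, the `alpha` and `radius` parameters, the function names, and the toy contact matrix are all assumptions made here for illustration only.

```python
import numpy as np

def contacts_to_distance(contacts, alpha=1.0, eps=1e-9):
    """Assumed conversion: distance ~ 1 / contact_count**alpha.

    `contacts` is a symmetric matrix of Hi-C contact counts between
    genomic bins. `eps` avoids division by zero for bins with no contacts.
    """
    contacts = np.asarray(contacts, dtype=float)
    dist = 1.0 / np.power(contacts + eps, alpha)
    np.fill_diagonal(dist, 0.0)  # a bin is at distance 0 from itself
    return dist

def spatial_density(dist, radius):
    """For each bin, count how many other bins lie within `radius`."""
    within = dist <= radius
    np.fill_diagonal(within, False)  # do not count the bin itself
    return within.sum(axis=1)

# Toy contact matrix (made-up values): bins 0-2 interact often, bin 3 rarely.
contacts = np.array([
    [0, 10, 8, 1],
    [10, 0, 9, 1],
    [8, 9, 0, 2],
    [1, 1, 2, 0],
])

dist = contacts_to_distance(contacts)
density = spatial_density(dist, radius=0.2)
print(density)  # one density value per genomic bin
```

Under these assumptions, a target site falling in bin 3 would receive a lower density value than one in bins 0 to 2; such per-site values could then be supplied as additional features alongside sequence-based ones when fitting a model such as LASSO.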
biochemical research methods, mathematical & computational biology