A sequence-based deep learning approach to predict CTCF-mediated chromatin loop

Hao Lv, Fu-Ying Dao, Hasan Zulfiqar, Wei Su, Hui Ding, Li Liu, Hao Lin
DOI: https://doi.org/10.1093/bib/bbab031
IF: 9.5
2021-02-25
Briefings in Bioinformatics
Abstract: The three-dimensional (3D) architecture of chromosomes is of crucial importance for transcription regulation and DNA replication. Various high-throughput chromosome conformation capture-based methods have revealed that CTCF-mediated chromatin loops are a major component of 3D architecture. However, CTCF-mediated chromatin loops are cell type specific, and most chromatin interaction capture techniques are time-consuming and labor-intensive, which restricts their use across a very large number of cell types. Genomic sequence-based computational models are sophisticated enough to capture important features of chromatin architecture and help to identify chromatin loops. In this work, we develop Deep-loop, a convolutional neural network model that integrates k-tuple nucleotide frequency component, nucleotide pair spectrum encoding, position conservation, position scoring function and natural vector features for the prediction of chromatin loops. In a series of cross-validation experiments, Deep-loop shows excellent performance in identifying chromatin loops from different cell types. The source code of Deep-loop is freely available at the repository https://github.com/linDing-group/Deep-loop.
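To make the first of the listed sequence features concrete, the following is a minimal sketch of a k-tuple nucleotide frequency encoding. It is an illustration under stated assumptions, not the Deep-loop implementation: the function name `ktuple_frequencies`, the normalization by the number of valid windows, and the handling of ambiguous bases are all hypothetical choices, and the exact definition used in the paper may differ.

```python
from itertools import product


def ktuple_frequencies(seq, k=2):
    """Return the normalized frequency of every k-tuple over A, C, G, T.

    Hypothetical helper for illustration only; not taken from Deep-loop.
    The output has 4**k entries, ordered lexicographically (AA, AC, ...).
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k over the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # assumption: skip windows containing N or other symbols
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]
```

Calling `ktuple_frequencies(sequence, k=3)` on a DNA string would give a 64-dimensional vector that could serve as one input block alongside the other encodings named in the abstract.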
biochemical research methods,mathematical & computational biology