Predicting CTCF’s Cell Type-Specific Binding Sites in Human Genome

Lu Chai, Jie Gao, Zihan Li, Yunjie Wang, Junjie Liu, Yong Wang, Lirong Zhang, Hao Sun
DOI: https://doi.org/10.21203/rs.3.rs-5042361/v1
2024-01-01
Abstract: The CCCTC-binding factor (CTCF) is pivotal in orchestrating diverse biological functions across the human genome, yet the mechanisms driving its cell type-specific DNA binding affinity remain underexplored. Here, we collected ChIP-seq data from 67 cell lines in ENCODE, constructed a unique dataset of cell type-specific CTCF binding sites (CBS), and trained convolutional neural networks (CNNs) to dissect the patterns of CTCF binding specificity. Our analysis reveals that the transcription factors RAD21/SMC3 and chromatin accessibility are more predictive than sequence motifs and histone modifications. Integrating these features achieved AUC values consistently above 0.868, highlighting their utility in deciphering CTCF binding dynamics. This study provides a deeper understanding of the regulatory functions of CTCF through a machine learning framework.
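For readers less familiar with the AUC metric quoted in the abstract, the following minimal sketch shows how an area under the ROC curve can be computed from binary labels and classifier scores. It is not the authors' evaluation code: the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration. The sketch uses the rank-based (Mann-Whitney) definition, in which AUC is the probability that a randomly chosen positive example scores higher than a randomly chosen negative one.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: iterable of 0/1 values (1 = positive class, e.g. a bound site)
    scores: iterable of classifier scores, higher = more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two positives, two negatives.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(toy_labels, toy_scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value above 0.868 means a positive site outranks a negative one in more than 86.8% of such pairs. The quadratic pairwise loop is kept for clarity; a sort-based implementation would be preferable for large datasets.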