DLoopCaller: A deep learning approach for predicting genome-wide chromatin loops by integrating accessible chromatin landscapes

Siguo Wang, Qinhu Zhang, Ying He, Zhen Cui, Zhenghao Guo, Kyungsook Han, De-Shuang Huang
DOI: https://doi.org/10.1371/journal.pcbi.1010572
2022-10-09
PLoS Computational Biology
Abstract: In recent years, major advances in chromosome conformation capture technologies have helped meet researchers' need for high-quality, high-resolution contact interactions. Discriminating loops from genome-wide contact interactions is crucial for dissecting three-dimensional (3D) genome structure and function. Here, we present DLoopCaller, a deep learning method that predicts genome-wide chromatin loops by combining accessible chromatin landscapes with raw Hi-C contact maps. Available orthogonal data (ChIA-PET/HiChIP and Capture Hi-C) were used to generate positive samples with a wider contact matrix, which makes it possible to find more potential genome-wide chromatin loops. The experimental results demonstrate that DLoopCaller improves the accuracy of predicting genome-wide chromatin loops compared to the state-of-the-art method Peakachu. Moreover, compared to two of the most popular loop callers, HiCCUPS and Fit-Hi-C, DLoopCaller identifies some unique interactions. We conclude that incorporating chromatin landscapes on the one-dimensional genome contributes to understanding 3D genome organization, and that the identified chromatin loops reveal cell-type specificity and transcription factor motif co-enrichment across different cell lines and species.
The emergence of chromosome conformation capture technologies has given researchers the opportunity to understand the role of three-dimensional genome structure in regulating gene expression and cell functions. Although significant progress has been made in studying chromatin loops, the basic functional units that directly regulate gene expression, limitations remain in how to adequately extract features from contact maps and rationally use multi-omics data. In this work, we combine accessible chromatin landscapes and raw Hi-C contact maps within a deep learning framework to identify genome-wide chromatin loops. We also use available orthogonal data (ChIA-PET/HiChIP and Capture Hi-C) to generate training samples. We demonstrate that the proposed method identifies some unique chromatin loops with high confidence. Moreover, the identified chromatin loops reveal cell-type specificity and transcription factor motif co-enrichment across different cell lines and species, which may help us understand the mechanisms of tissue-specific gene expression and transcriptional regulation.
biochemical research methods, mathematical & computational biology