DeepChIA-PET: Accurately predicting ChIA-PET from Hi-C and ChIP-seq with deep dilated networks

Tong Liu, Zheng Wang
DOI: https://doi.org/10.1371/journal.pcbi.1011307
2023-07-14
PLoS Computational Biology
Abstract: Chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) can capture genome-wide chromatin interactions mediated by a specific DNA-associated protein. ChIA-PET experiments have been applied to explore the key roles of different protein factors in chromatin folding and transcription regulation. However, compared with widely available Hi-C and ChIP-seq data, few ChIA-PET datasets are available in the literature. A computational method that accurately predicts ChIA-PET interactions from Hi-C and ChIP-seq data is needed to reduce the effort of performing wet-lab experiments. Here we present DeepChIA-PET, a supervised deep learning approach that can accurately predict ChIA-PET interactions by learning the latent relationships between ChIA-PET and two widely used data types: Hi-C and ChIP-seq. We trained our deep models with CTCF-mediated ChIA-PET of GM12878 as ground truth, and the deep network contains 40 dilated residual convolutional blocks. We first showed that DeepChIA-PET with only Hi-C as input significantly outperforms Peakachu, another computational method that predicts ChIA-PET from Hi-C but uses random forests. We next showed that adding ChIP-seq as an extra input improves the classification performance of DeepChIA-PET, although Hi-C plays a more prominent role in DeepChIA-PET than ChIP-seq. Our evaluation results indicate that our learned models can accurately predict not only CTCF-mediated ChIA-PET in GM12878 and HeLa but also non-CTCF ChIA-PET interactions, including RNA polymerase II (RNAPII) ChIA-PET of GM12878, RAD21 ChIA-PET of GM12878, and RAD21 ChIA-PET of K562. Overall, DeepChIA-PET is an accurate tool for predicting the ChIA-PET interactions mediated by various chromatin-associated proteins from different cell types.
Author summary: Various techniques have been widely used to model and investigate three-dimensional (3D) genomes, such as Hi-C, ChIA-PET, and HiChIP. Unlike Hi-C, ChIA-PET can capture genome-wide chromosomal contacts mediated by a predefined DNA-associated protein. While Hi-C data are plentiful for different cell lines and various species, only a limited number of ChIA-PET data sets are publicly available. Here we developed a deep learning method that uses Hi-C and ChIP-seq data to predict ChIA-PET directly. Our evaluation results demonstrate that our learned models can successfully predict ChIA-PET interactions mediated by a different DNA-associated protein in a separate cell line. Our method is an effective alternative to obtaining ChIA-PET contacts from a wet lab.
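To illustrate what a "dilated residual convolutional block" computes, the following is a minimal pure-Python sketch and not the authors' implementation. The abstract does not specify the kernel size, channel counts, normalization, or dilation schedule, so the 3x3 kernel, the single channel, the ReLU activation, and the names `dilated_conv2d`, `residual_block`, and `receptive_field` are all assumptions made for illustration. The input matrix stands in for a small patch of a Hi-C contact map.

```python
from typing import List

Matrix = List[List[float]]


def dilated_conv2d(x: Matrix, kernel: Matrix, dilation: int) -> Matrix:
    """3x3 cross-correlation with zero padding; output has the same shape as x.

    With dilation d, the nine kernel taps sample x at offsets of -d, 0, +d
    in each direction instead of -1, 0, +1.
    """
    n_rows, n_cols = len(x), len(x[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            acc = 0.0
            for a in range(3):
                for b in range(3):
                    r = i + (a - 1) * dilation
                    c = j + (b - 1) * dilation
                    # Positions outside the matrix are treated as zeros.
                    if 0 <= r < n_rows and 0 <= c < n_cols:
                        acc += kernel[a][b] * x[r][c]
            out[i][j] = acc
    return out


def relu(x: Matrix) -> Matrix:
    return [[max(0.0, v) for v in row] for row in x]


def residual_block(x: Matrix, kernel: Matrix, dilation: int) -> Matrix:
    """Assumed block layout: output = x + ReLU(dilated_conv(x))."""
    y = relu(dilated_conv2d(x, kernel, dilation))
    return [[xv + yv for xv, yv in zip(xr, yr)] for xr, yr in zip(x, y)]


def receptive_field(dilations: List[int], kernel_size: int = 3) -> int:
    """Receptive field (in bins, along one axis) of stacked dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


if __name__ == "__main__":
    # Toy 5x5 "contact map" with a single non-zero entry.
    x = [[0.0] * 5 for _ in range(5)]
    x[0][0] = 1.0
    # Kernel that reads only the upper-left tap.
    k = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    y = residual_block(x, k, dilation=2)
    print(y[2][2])                      # signal moved two bins down and right
    print(receptive_field([1, 2, 4]))   # hypothetical dilation schedule
```

The residual connection keeps the original signal available to every later block, while growing the dilation lets stacked blocks cover long genomic distances without pooling. This is one plausible reason to use such blocks on contact maps, though the abstract itself does not state the motivation.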
biochemical research methods, mathematical & computational biology