HiCArch: A Deep Learning-based Hi-C Data Predictor

Xiao Zheng, Jinghua Wang, Chaochen Wang
DOI: https://doi.org/10.1101/2021.11.26.470146
2021-01-01
Abstract: Hi-C sequencing analysis is one of the most popular methods for studying three-dimensional (3D) genome structures, which affect gene expression and other cellular activities by enabling distal regulation through spatial proximity. Hi-C sequencing analysis enhances our understanding of chromatin functionality. However, due to the high cost of Hi-C sequencing, publicly available high-resolution Hi-C data (such as 10 kb) are limited to only a few cell types. In this paper we present HiCArch, a lightweight deep neural network that predicts Hi-C contact matrices from 11 common 1D epigenomic features. HiCArch identifies topologically associating domains (TADs) at 10 kb resolution within a distance of 10 Mb. When trained on the K562 cell line, HiCArch obtains a training Pearson correlation of 0.9123 and a test Pearson correlation of 0.9195, both significantly higher than those of previous approaches such as HiC-Reg [1], Akita [2], DeepC [3], and Epiphany [4].
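The abstract reports Pearson correlation scores but does not say how they are computed. As an illustration only, the following minimal sketch shows one plausible way to score a predicted contact matrix against an observed one, restricted to bin pairs within 10 Mb at 10 kb resolution. The function name `banded_pearson`, its parameters, and the choice to pool all in-range bin pairs into a single correlation are assumptions made for this example, not the authors' evaluation procedure.

```python
import numpy as np


def banded_pearson(pred, obs, resolution_bp=10_000, max_distance_bp=10_000_000):
    """Pearson correlation between two square contact matrices.

    Hypothetical helper: only upper-triangle bin pairs (i, j) whose genomic
    separation (j - i) * resolution_bp is at most max_distance_bp are used.
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    if pred.ndim != 2 or pred.shape[0] != pred.shape[1] or pred.shape != obs.shape:
        raise ValueError("pred and obs must be square matrices of the same shape")

    n_bins = pred.shape[0]
    max_offset = max_distance_bp // resolution_bp  # 1000 bins for 10 Mb at 10 kb

    # Upper triangle (including the diagonal), then keep pairs inside the band.
    rows, cols = np.triu_indices(n_bins)
    in_band = (cols - rows) <= max_offset

    x = pred[rows[in_band], cols[in_band]]
    y = obs[rows[in_band], cols[in_band]]
    return float(np.corrcoef(x, y)[0, 1])
```

Pooling all in-range pairs is only one convention; a distance-stratified correlation (one score per diagonal offset) is a common alternative that avoids the score being dominated by the strong decay of contact frequency with distance.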