Extracting Multi-Way Chromatin Contacts from Hi-C Data

Lei Liu, Bokai Zhang, Changbong Hyeon
DOI: https://doi.org/10.1371/journal.pcbi.1009669
2021-01-01
Abstract: There is a growing realization that multi-way chromatin contacts formed in chromosome structures are fundamental units of gene regulation. However, because such contacts are rare and complex, they are difficult to detect and identify experimentally. Assuming that chromosome structures can be mapped onto a Gaussian polymer network, we derive analytic expressions for n-body contact probabilities (n > 2) among chromatin loci from the pairwise genomic contact frequencies available in Hi-C, and show that multi-way contact probability maps can in principle be extracted from Hi-C. The three-body (triplet) contact probabilities calculated from our theory correlate well with those from measurements including Tri-C, MC-4C and SPRITE. Maps of multi-way chromatin contacts calculated from our analytic expressions can not only complement experimental measurements but also offer a better understanding of related issues, such as cell-line dependent assembly of multiple genes and enhancers into chromatin hubs, competition between long-range and short-range multi-way contacts, and condensates of multiple CTCF anchors.
Author summary: DNA looping is often cited as the initiating step of gene expression. However, there is growing evidence that 'chromatin hubs' comprising multiple genes and enhancers play vital roles in gene expression and regulation. A number of experimental techniques for detecting and identifying multi-way chromosome interactions are now available, yet detecting such multi-body interactions remains statistically challenging. This study proposes a method to predict multi-way chromatin contacts from the pairwise contact frequencies available in Hi-C datasets. Because chromosomes are polymer chains, the pairwise contact probabilities are not entirely independent of each other; they carry correlations that reflect the underlying chromosome structure. We extract these correlations hidden in Hi-C datasets by leveraging theoretical arguments based on polymer physics.
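The idea that pairwise statistics constrain three-body contacts in a Gaussian model can be illustrated numerically. The sketch below is not the paper's analytic expression. It assumes zero-mean, isotropic, jointly Gaussian separation vectors in 3D, takes hypothetical mean-square distances as input, and defines a triplet contact as all three pairwise distances falling below a cutoff `r_c` (this definition is an assumption made here). The only exact relations used are the identity r_ik = r_ij + r_jk, which fixes the cross-correlation between r_ij and r_jk, and the standard Maxwell-type formula for the pairwise contact probability of a Gaussian separation vector. The function names and parameters are illustrative.

```python
import math
import numpy as np


def pair_contact_probability(msd, r_c):
    """P(|r| < r_c) for an isotropic 3D Gaussian vector with <r^2> = msd."""
    x = r_c * math.sqrt(3.0 / (2.0 * msd))
    return math.erf(x) - (2.0 / math.sqrt(math.pi)) * x * math.exp(-x * x)


def triplet_contact_probability(msd_ij, msd_jk, msd_ik, r_c,
                                n_samples=200_000, seed=0):
    """Monte Carlo estimate of P(d_ij < r_c, d_jk < r_c, d_ik < r_c).

    Assumption (illustrative only): r_ij and r_jk are zero-mean, isotropic,
    jointly Gaussian 3D vectors, so their joint law is fixed by the three
    mean-square distances.
    """
    # r_ik = r_ij + r_jk  =>  <r_ij . r_jk> = (msd_ik - msd_ij - msd_jk) / 2
    cross = 0.5 * (msd_ik - msd_ij - msd_jk)
    # Covariance of one Cartesian component pair (isotropy: divide by 3).
    cov = np.array([[msd_ij, cross], [cross, msd_jk]]) / 3.0
    rng = np.random.default_rng(seed)
    # Shape (n_samples, 3, 2): three independent components per sample,
    # each holding a correlated (r_ij, r_jk) component pair.
    samples = rng.multivariate_normal(np.zeros(2), cov, size=(n_samples, 3))
    r_ij = samples[..., 0]
    r_jk = samples[..., 1]
    r_ik = r_ij + r_jk
    in_contact = (
        (np.linalg.norm(r_ij, axis=1) < r_c)
        & (np.linalg.norm(r_jk, axis=1) < r_c)
        & (np.linalg.norm(r_ik, axis=1) < r_c)
    )
    return float(in_contact.mean())


if __name__ == "__main__":
    # Hypothetical inputs: an ideal-chain-like case where msd_ik = msd_ij + msd_jk.
    print(pair_contact_probability(10.0, 2.0))
    print(triplet_contact_probability(10.0, 10.0, 20.0, 2.0))
```

In this toy setting the triplet estimate is bounded above by each pairwise probability, and changing `msd_ik` at fixed `msd_ij` and `msd_jk` shifts the triplet probability through the cross-correlation term alone, which is the kind of correlation between pairwise contacts the summary refers to.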