Abstract: Researchers developed chromosome capture methods such as Hi-C to better understand DNA's 3D folding in nuclei. The Hi-C method captures contact frequencies between pairs of DNA segments across the genome. When analyzing Hi-C data sets, it is common to group these pairs using standard bioinformatics methods (e.g., PCA). Other approaches treat Hi-C data as weighted networks, where connected nodes represent DNA segments in 3D proximity. In this representation, one can leverage community detection techniques developed in complex network theory to group nodes into mesoscale communities with similar connection patterns. Although several studies have successfully analyzed Hi-C data in this way, they commonly report and study only the most typical community structure. In reality, there are often several valid candidates. Depending on algorithm design, different community detection methods that focus on slightly different connectivity features may disagree on the ideal node groupings. In fact, even the same community detection method may yield different results if it uses a stochastic algorithm. This ambiguity, known as community inconsistency, is fundamental to community detection and is shared by most complex networks whenever interactions span all scales in the network. This paper explores the inconsistency of 3D communities in Hi-C data for all human chromosomes. We base our analysis on two inconsistency metrics, one local and one global, and quantify the network scales where the community separation is most variable. For example, we find that TADs are less reliable than A/B compartments and that nodes with highly variable node-community memberships are associated with open chromatin. Overall, our study provides a helpful framework for data-driven researchers and increases awareness of some inherent challenges when clustering Hi-C data into 3D communities.
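
The idea of a global and a local inconsistency score can be illustrated with a minimal Python sketch. The abstract does not define the paper's two metrics, so the choices here are assumptions made only for illustration: the global score is the mean pairwise variation of information between partitions, and the local score measures how often a node's co-membership with other nodes flips across partitions. The function names and the toy partitions are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def variation_of_information(a, b):
    """Variation of information between two partitions (label sequences).

    Zero when the partitions are identical up to a relabeling.
    """
    n = len(a)
    size_a, size_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    vi = 0.0
    for (x, y), n_xy in joint.items():
        p = n_xy / n
        vi -= p * (log(p / (size_a[x] / n)) + log(p / (size_b[y] / n)))
    return vi


def global_inconsistency(partitions):
    """Assumed global score: mean VI over all pairs of partitions."""
    pairs = list(combinations(partitions, 2))
    return sum(variation_of_information(a, b) for a, b in pairs) / len(pairs)


def local_inconsistency(partitions):
    """Assumed local score per node, in [0, 1].

    For node i and every other node j, p is the fraction of partitions in
    which i and j share a community. The term 4*p*(1-p) is 0 when the pair
    is always together or always apart and 1 when it is split 50/50.
    """
    n, m = len(partitions[0]), len(partitions)
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i == j:
                continue
            p = sum(part[i] == part[j] for part in partitions) / m
            total += 4 * p * (1 - p)
        scores.append(total / (n - 1))
    return scores


# Toy data: three runs of a (hypothetical) stochastic community detection
# method on six nodes. Node 2 changes community in the second run; the
# third run equals the first up to a relabeling.
runs = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
]

if __name__ == "__main__":
    print("global:", global_inconsistency(runs))
    print("local: ", local_inconsistency(runs))
```

In this toy case the node that switches community receives the highest local score, while the relabeled run contributes no disagreement with the first. On real Hi-C networks the partitions would come from repeated runs of a community detection method rather than hand-written lists.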

Sequence-based Multiscale Model (SeqMM) for High-throughput chromosome conformation capture (Hi-C) data analysis

Sequence-based multiscale modeling for high-throughput chromosome conformation capture (Hi-C) data analysis

Multiscale molecular modeling of chromatin with MultiMM: From nucleosomes to the whole genome

Development of Multiomics in Situ Pairwise Sequencing (Mip-Seq) for Single-cell Resolution Multidimensional Spatial Omics

Multiscale differential geometry learning of networks with applications to single-cell RNA sequencing data

BHi-Cect: a top-down algorithm for identifying the multi-scale hierarchical structure of chromosomes

Mapping robust multiscale communities in chromosome contact networks

A Comparative Study for Identifying the Chromosome-Wide Spatial Clusters from High-Throughput Chromatin Conformation Capture Data

Examining dynamics of three-dimensional genome organization with multi-task matrix factorization

Integrative Modeling of 3D Genome Organization by Bayesian Molecular Dynamics Simulations with Hi-C Metainference

miniMDS: 3D structural inference from high-resolution Hi-C data

Inferring Spatial Organization of Individual Topologically Associated Domains via Piecewise Helical Model

Multi-scaling Hierarchical Structure Analysis on the Sequence of E. Coli Complete Genome

Hi-C-guided many-polymer model to decipher 3D genome organization

Multiscale Theory and Computational Method for Biomolecule Simulations

Sci-Hi-C: A single-cell Hi-C method for mapping 3D genome organization in large number of single cells

Multi-scale Poisson process approaches for differential expression analysis of high-throughput sequencing data

Exploring 3D community inconsistency in human chromosome contact networks

Detection of colinear blocks and synteny and evolutionary analyses based on utilization of MCScanX

The Advancement of Analysis Methods of Chromosome Conformation Capture Data