ENT3C: an entropy-based similarity measure for Hi-C and micro-C derived contact matrices

Xenia Lainscsek, Leila Taher
DOI: https://doi.org/10.1101/2024.01.30.577923
2024-03-07
Abstract: Hi-C and micro-C sequencing have shed light on the profound importance of 3D genome organization in cellular function by probing 3D contact frequencies across the linear genome. The resulting contact matrices are extremely sparse and susceptible to technical and sequence-based biases, making their comparison challenging. The development of reliable, robust and efficient methods for quantifying similarity between contact matrices is crucial for investigating variations in 3D genome organization between different cell types or under different conditions, as well as for evaluating experimental reproducibility. We present a novel method, ENT3C, which measures the change in pattern complexity in the vicinity of contact matrix diagonals to quantify their similarity. ENT3C provides a robust, user-friendly Hi-C or micro-C contact matrix similarity metric and a characteristic entropy signal that can be used to gain detailed biological insights into 3D genome organization.
Bioinformatics
What problem does this paper attempt to address?
The paper introduces ENT3C, a method for quantifying the similarity of contact matrices obtained from Hi-C and micro-C sequencing. Comparing these matrices is difficult because they are extremely sparse and susceptible to technical and sequence-based biases. ENT3C quantifies similarity by measuring the change in pattern complexity in the vicinity of the contact matrix diagonals. The result is a robust, user-friendly similarity metric, together with a characteristic entropy signal that can be used to study differences in 3D genome organization between cell types or conditions and to evaluate experimental reproducibility.
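The idea of tracking pattern complexity along the diagonal can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that complexity is measured as the von Neumann entropy of the Pearson correlation matrix of square submatrices slid along the diagonal, and that two contact matrices are compared by the Pearson correlation of their entropy signals. The function names, the window size and the step are hypothetical choices made for illustration.

```python
import numpy as np

def window_entropy(sub):
    """Entropy of one square submatrix (assumed: von Neumann entropy
    of its row-wise Pearson correlation matrix)."""
    n = sub.shape[0]
    corr = np.corrcoef(sub)
    # Constant rows give NaN correlations; treat them as uncorrelated.
    corr = np.nan_to_num(corr, nan=0.0)
    np.fill_diagonal(corr, 1.0)
    rho = corr / n                      # scale so the trace equals 1
    eig = np.linalg.eigvalsh(rho)
    eig = eig[eig > 1e-12]              # drop zero / numerically negative values
    return float(-np.sum(eig * np.log(eig)))

def entropy_signal(matrix, window=8, step=1):
    """Entropy of each window-by-window block centred on the diagonal."""
    n = matrix.shape[0]
    starts = range(0, n - window + 1, step)
    return np.array([window_entropy(matrix[i:i + window, i:i + window])
                     for i in starts])

def similarity(a, b, window=8, step=1):
    """Pearson correlation between the entropy signals of two matrices."""
    sa = entropy_signal(a, window, step)
    sb = entropy_signal(b, window, step)
    return float(np.corrcoef(sa, sb)[0, 1])

# Toy symmetric "contact matrix" built from random counts (not real data).
rng = np.random.default_rng(0)
counts = rng.poisson(5, size=(40, 40)).astype(float)
toy = counts + counts.T
signal = entropy_signal(toy)
```

In this sketch a matrix compared with itself scores 1, and lower scores indicate that local complexity changes differently along the two diagonals. Real data would additionally need binning, normalization and handling of empty bins, none of which are shown here.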