Abstract: Hi-C is a genome-wide chromosome conformation capture technology that detects interactions between pairs of genomic regions and reveals higher-order chromatin structures. Conceptually, Hi-C data count interaction frequencies between every position in the genome and every other position. Biologically functional interactions are expected to occur more frequently than transient background and artefactual interactions. To identify biologically relevant interactions, several background models have been proposed that account for biases such as distance, GC content and mappability. Here we introduce MaxHiC, a background correction tool that deals with these complex biases and robustly identifies statistically significant interactions in both Hi-C and capture Hi-C experiments. MaxHiC uses a negative binomial distribution model and a maximum likelihood technique to correct biases in both Hi-C and capture Hi-C libraries. We systematically benchmark MaxHiC against major Hi-C background correction tools, including Hi-C significant interaction callers (SICs) and Hi-C loop callers, using published Hi-C, capture Hi-C, and Micro-C datasets. Our results demonstrate that 1) interacting regions identified by MaxHiC overlap significantly more with known regulatory features (e.g. active chromatin histone marks, CTCF binding sites, DNase sensitivity) and with disease-associated genome-wide association study (GWAS) SNPs than those identified by existing models; 2) pairs of interacting regions identified by MaxHiC are more likely to be linked by eQTL pairs; and 3) these pairs are more likely to link known regulatory features, including known functional enhancer-promoter pairs validated by CRISPRi, than those identified by any existing method. We also demonstrate that interactions between different genomic region types have distinct distance distributions that are revealed only by MaxHiC. MaxHiC is publicly available as a Python package for the analysis of Hi-C, capture Hi-C and Micro-C data.
MaxHiC is a robust machine-learning-based tool for identifying significant interacting regions from both Hi-C and capture Hi-C data. Existing models are designed for either Hi-C or capture Hi-C data; MaxHiC is applicable to both, using a separate model for each library type. MaxHiC can also analyse very deep Hi-C libraries (e.g., Micro-C) without computational issues. MaxHiC significantly outperforms existing Hi-C significant interaction callers, and even Hi-C loop callers, in terms of enrichment for interactions between known regulatory regions and for biologically relevant interactions.
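
To make the idea of a negative binomial background model fitted by maximum likelihood more concrete, the following is a minimal sketch in plain Python. It is not the MaxHiC implementation: the real tool models distance and other biases, whereas this sketch assumes a single shared mean for all bin pairs. The function names (`nb_logpmf`, `fit_nb`, `nb_pvalue`) and the toy counts are hypothetical and chosen only for illustration.

```python
import math

def nb_logpmf(k, mu, r):
    """Log-probability of count k under a negative binomial with mean mu and dispersion r."""
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))

def fit_nb(counts, lo=-5.0, hi=10.0, iters=80):
    """Maximum-likelihood fit of (mu, r).

    For this parameterisation the MLE of mu is the sample mean, so only
    log(r) is searched, here with a golden-section search on [lo, hi].
    Assumes the counts have a positive mean.
    """
    mu = sum(counts) / len(counts)

    def nll(log_r):
        r = math.exp(log_r)
        return -sum(nb_logpmf(k, mu, r) for k in counts)

    phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if nll(c) < nll(d):
            b = d
        else:
            a = c
    return mu, math.exp((a + b) / 2)

def nb_pvalue(k_obs, mu, r):
    """Upper-tail p-value P(X >= k_obs) under the fitted background."""
    below = sum(math.exp(nb_logpmf(k, mu, r)) for k in range(k_obs))
    return max(0.0, 1.0 - below)

# Toy, made-up read counts for bin pairs assumed to be background.
background = [0, 1, 0, 2, 1, 0, 3, 1, 0, 5, 2, 0, 1, 4, 0, 1, 2, 0, 7, 1]
mu, r = fit_nb(background)

# A bin pair with many more reads than the background predicts gets a small p-value.
p_high = nb_pvalue(20, mu, r)
p_low = nb_pvalue(2, mu, r)
```

In this simplified setting, a bin pair would be called significant when its p-value falls below a chosen threshold after multiple-testing correction. The correction step is omitted here.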
