Accurate Loop Calling for 3D Genomic Data with cLoops

Yaqiang Cao, Zhaoxiong Chen, Xingwei Chen, Daosheng Ai, Guoyu Chen, Joseph McDermott, Yi Huang, Xiaoxiao Guo, Jing-Dong J. Han
DOI: https://doi.org/10.1093/bioinformatics/btz651
IF: 5.8
2019-01-01
Bioinformatics
Abstract: Sequencing-based 3D genome mapping technologies can identify loops formed by interactions between regulatory elements hundreds of kilobases apart. Existing loop-calling tools are mostly restricted to a single data type, have accuracy that depends on a contact matrix of pre-defined resolution or on called peaks, and can have prohibitive hardware costs. Here we introduce cLoops (‘see loops’) to address these limitations. cLoops is based on the clustering algorithm cDBSCAN, which directly analyzes the paired-end tags (PETs) to find candidate loops, and it uses a permuted local background to estimate statistical significance. These two data-type-independent processes enable loops to be reliably identified for both sharp and broad peak data, including but not limited to ChIA-PET, Hi-C, HiChIP and Trac-looping data. Loops identified by cLoops showed much less distance-dependent bias and higher enrichment relative to local regions than those from existing tools. Altogether, cLoops improves the accuracy of detecting 3D-genomic loops from sequencing data, is versatile, flexible and efficient, has modest hardware requirements, and is freely available at https://github.com/YaqiangCao/cLoops.
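To make the idea of clustering PETs directly more concrete, here is a minimal sketch. It is not the cLoops implementation and it is not cDBSCAN: it uses plain textbook DBSCAN with a brute-force neighbor search, treating each intra-chromosomal PET as a 2D point (left-end coordinate, right-end coordinate). The function names `dbscan_2d` and `candidate_loops`, the Chebyshev distance, the toy coordinates and the `eps` and `min_pts` values are all assumptions chosen for illustration. The permuted-local-background significance test is not covered.

```python
from collections import deque


def dbscan_2d(points, eps, min_pts):
    """Textbook DBSCAN on 2D points (illustrative; NOT cDBSCAN).

    Two points are neighbors when both coordinates differ by at most eps
    (Chebyshev distance, an assumption made for this sketch).
    Returns one label per point: -1 for noise, 0..k-1 for clusters.
    """
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        xi, yi = points[i]
        # Brute-force O(n) scan; includes point i itself.
        return [j for j, (xj, yj) in enumerate(points)
                if abs(xi - xj) <= eps and abs(yi - yj) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seed = neighbors(i)
        if len(seed) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        labels[i] = cluster
        queue = deque(seed)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reach)
        cluster += 1
    return labels


def candidate_loops(pets, eps, min_pts):
    """Summarize each PET cluster as a candidate loop with two anchors."""
    grouped = {}
    for (x, y), label in zip(pets, dbscan_2d(pets, eps, min_pts)):
        if label != -1:
            grouped.setdefault(label, []).append((x, y))
    loops = []
    for label in sorted(grouped):
        xs = [p[0] for p in grouped[label]]
        ys = [p[1] for p in grouped[label]]
        loops.append({"left_anchor": (min(xs), max(xs)),
                      "right_anchor": (min(ys), max(ys)),
                      "pets": len(xs)})
    return loops


# Toy PETs on one chromosome: four tags supporting one interaction, two stray tags.
toy_pets = [(1000, 50000), (1100, 50100), (1050, 49950), (1200, 50200),
            (30000, 90000), (5000, 70000)]
toy_loops = candidate_loops(toy_pets, eps=500, min_pts=3)
for loop in toy_loops:
    print(loop)
```

Because the clustering operates on the tag coordinates themselves, the anchor boundaries fall out of the data rather than from a fixed bin size or a prior peak-calling step, which is the property the abstract emphasizes.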