Abstract:

Background: Three-dimensional (3D) genome spatial organization is critical for numerous cellular functions, including transcription, while certain conformation-driven structural alterations are frequently oncogenic. Genome conformation had been difficult to elucidate, but the advent of chromatin conformation capture assays, notably Hi-C, has transformed understanding of chromatin architecture and yielded numerous biological insights. Although most of these findings have flowed from analysis of the proximity data produced by these assays, generating 3D reconstructions has been shown to add value, deriving in part from superposing genomic features on the reconstruction. However, the advantages of 3D structure-based analyses are clearly conditional on the accuracy of the attendant reconstructions, which is difficult to assess. Proponents of competing reconstruction algorithms have evaluated their accuracy by recourse to simulation of toy structures and/or limited fluorescence in situ hybridization (FISH) imaging that features a handful of low-resolution probes. Accordingly, new methods of reconstruction accuracy assessment are needed.

Results: Here we utilize two recently devised assays to develop methodology for assessing 3D reconstruction accuracy. Multiplex FISH increases the number of probes by an order of magnitude, and hence the number of inter-probe distances by two orders, providing sufficient information for structure-level evaluation via mean-squared deviations (MSD). Crucially, underlying multiplex FISH applications are large numbers of coordinate-system-aligned replicates that provide the basis for a referent distribution for MSD statistics. Using this system, we show that reconstructions based on Hi-C data for IMR90 cells are accurate for some chromosomes but not others. The second new assay, genome architecture mapping, utilizes large numbers of thin cryosections to obtain a measure of proximity. We exploit the planarity of the cryosections (which is not used in inferring proximity) to obtain measures of reconstruction accuracy, with referents provided via resampling. Application to mouse embryonic stem cells shows reconstruction accuracies that vary by chromosome.

Conclusions: We have developed methods for assessing the accuracy of 3D genome reconstructions that exploit features of the recently advanced multiplex FISH and genome architecture mapping assays. These approaches can help overcome the absence of gold standards for making such assessments, which are important in view of the considerable uncertainties surrounding 3D genome reconstruction.
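
The MSD-with-referent idea described for multiplex FISH can be illustrated with a minimal Python sketch. This is not the authors' implementation, and everything specific in it is an assumption made for illustration: the function names (`align`, `msd`, `referent_msds`, `assess`) are hypothetical, the reconstruction is mapped into the FISH coordinate system by a similarity (Procrustes-type) transform, the replicate mean stands in for the target structure, and the referent distribution is formed by leave-one-out MSDs of each replicate against the mean of the others.

```python
import numpy as np


def align(source, target):
    """Similarity-align source onto target; both are (n_probes, 3) arrays.

    Assumption: rotation + uniform scale + translation is an adequate way to
    bring a reconstruction into the FISH coordinate system.
    """
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    s, t = source - mu_s, target - mu_t
    u, sing, vt = np.linalg.svd(s.T @ t)
    # Force a proper rotation (no reflection); a design choice of this sketch.
    d = np.sign(np.linalg.det(u @ vt))
    diag = np.array([1.0, 1.0, d])
    rot = u @ np.diag(diag) @ vt
    scale = (sing * diag).sum() / (s ** 2).sum()
    return scale * s @ rot + mu_t


def msd(a, b):
    """Mean over probes of the squared Euclidean deviation between a and b."""
    return float(((a - b) ** 2).sum(axis=1).mean())


def referent_msds(replicates):
    """Leave-one-out MSD of each replicate against the mean of the others.

    replicates: (n_replicates, n_probes, 3), already in a common coordinate
    system, as the abstract states for multiplex FISH replicates.
    """
    total = replicates.sum(axis=0)
    n = len(replicates)
    return np.array([msd(r, (total - r) / (n - 1)) for r in replicates])


def assess(reconstruction, replicates):
    """Return the reconstruction's MSD and the fraction of referent MSDs
    that are at least as large (a simple empirical tail proportion)."""
    mean_structure = replicates.mean(axis=0)
    observed = msd(align(reconstruction, mean_structure), mean_structure)
    ref = referent_msds(replicates)
    return observed, float((ref >= observed).mean())
```

Under these assumptions, a reconstruction whose MSD falls well inside the referent distribution is no further from the replicate consensus than individual FISH replicates are, whereas an MSD beyond the upper tail would flag an inaccurate reconstruction for that chromosome.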

Improved accuracy assessment for 3D genome reconstructions