A deep learning-based method enables the automatic and accurate assembly of chromosome-level genomes

Zijie Jiang, Zhixiang Peng, Zhaoyuan Wei, Jiahe Sun, Yongjiang Luo, Lingzi Bie, Guoqing Zhang, Yi Wang
DOI: https://doi.org/10.1093/nar/gkae789
IF: 14.9
2024-09-19
Nucleic Acids Research
Abstract: The application of high-throughput chromosome conformation capture (Hi-C) technology enables the construction of chromosome-level assemblies. However, the correction of errors and the anchoring of sequences to chromosomes in the assembly remain significant challenges. In this study, we developed a deep learning-based method, AutoHiC, to address the challenges in chromosome-level genome assembly by enhancing contiguity and accuracy. Conventional Hi-C-aided scaffolding often requires manual refinement, but AutoHiC instead utilizes Hi-C data for automated workflows and iterative error correction. When trained on data from 300+ species, AutoHiC demonstrated a robust average error detection accuracy exceeding 90%. The benchmarking results confirmed its significant impact on genome contiguity and error correction. The innovative approach and comprehensive results of AutoHiC constitute a breakthrough in automated error detection, promising more accurate genome assemblies for advancing genomics research.
biochemistry & molecular biology
What problem does this paper attempt to address?