Estimating chromosome sizes from karyotype images enables validation of de novo genome assemblies

Martin Pippel, Arne Ludwig, Alexandr Dibrov, Eugene Myers
2023-01-26
Abstract: Owing to decreasing costs, the genomes of an increasing number of species are being sequenced. However, chromosome-scale genome assembly is affected by scaffolding errors, which result in incorrect chromosome sizes. Here, we present KICS, a semi-automated and cost-efficient approach for examining the validity of an assembly by estimating relative chromosome sizes from karyotype images. The method employs threshold-based image segmentation and uses the extracted chromosome areas as proxies for the actual chromosome sizes. A strong correlation between chromosomal area and DNA content, assessed from karyotype images of multiple species, confirmed the suitability of our approach. KICS can be applied to microchromosomes, and we used it to identify assembly errors introduced by Hi-C sequencing in the horseshoe bat genome. Using the human genome as a reference, for which telomere-to-telomere data are available, we estimate the error of our tool at approximately 6 Mb. We foresee that KICS will be routinely used as an inexpensive and intuitive tool to validate the de novo assembly of new genomes.
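The core idea in the abstract (threshold the image, measure each chromosome's area, and treat area as a proxy for size) can be illustrated with a minimal sketch. This is not the KICS implementation: the function names, the fixed threshold, the 4-connected flood fill, and the step that scales area fractions by a known total genome size are all assumptions made for illustration, and the input is a toy array rather than a real karyotype image.

```python
from collections import deque

import numpy as np


def segment_areas(image, threshold):
    """Return pixel areas of connected foreground regions (4-connectivity).

    Assumption: chromosomes are darker than the background, so pixels
    below `threshold` count as foreground.
    """
    mask = np.asarray(image) < threshold
    seen = np.zeros_like(mask, dtype=bool)
    rows, cols = mask.shape
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or seen[r, c]:
                continue
            # Breadth-first flood fill over one connected region.
            queue = deque([(r, c)])
            seen[r, c] = True
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def estimate_sizes_mb(areas, genome_size_mb):
    """Scale each area's share of the total area by an assumed genome size."""
    total = sum(areas)
    return [area * genome_size_mb / total for area in areas]


# Toy "karyotype": two dark rectangles on a light background.
image = np.full((6, 8), 255)
image[1:5, 1:3] = 0  # 4 x 2 = 8 pixels
image[2:4, 5:7] = 0  # 2 x 2 = 4 pixels

areas = segment_areas(image, threshold=128)
# Hypothetical total genome size of 120 Mb, chosen only for the example.
sizes_mb = estimate_sizes_mb(areas, genome_size_mb=120.0)
```

Comparing such area-derived estimates with the scaffold lengths of an assembly is what would flag a chromosome whose assembled size is implausible. A real image would also need steps this sketch omits, such as separating touching chromosomes and removing debris.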