Methodology for Assessing the Quality of Genomic Assembly Based on the Analysis of K-Mers Frequency in a Parallel Sequencing Sequencer

Andrey G. Borodinov, Vladimir V. Manoilov, Igor V. Zarutskiy, Alexander I. Petrov, Vladimir E. Kurochkin
DOI: https://doi.org/10.1109/apeie52976.2021.9647640
2021-11-19
Abstract: Counting k-mer occurrences is a recurring task in genome assembly. Analyzing the frequency distribution of k-mers makes it possible to find assembly errors in contigs that have already been formed. With the ongoing development of instrumentation for genetic analysis, there is an urgent need for methods that assess the quality of genome assembly. Such methods make it possible to assess the reliability of genetic analysis on both existing and newly developed devices. In this work, various software tools are analyzed, and programs are selected that allow the quality of genome assembly to be assessed for parallel sequencing instruments. The selected programs were used to process data obtained on the domestic parallel sequencer Nanofor SPS. Based on the results of this processing, the quality of the genome assembly was assessed by k-mer analysis, and recommendations were given for improving the hardware and software of the Nanofor SPS device.
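To make the idea of k-mer frequency analysis concrete, the following is a minimal Python sketch under stated assumptions, not the pipeline used in the paper (the abstract does not name the selected programs). The function names `count_kmers`, `kmer_spectrum`, and `unsupported_kmers`, the `min_count` threshold, and the toy sequences are all hypothetical and chosen for illustration only. It counts k-mers in reads, builds the frequency distribution, and lists contig positions whose k-mers are poorly supported by the reads, which is one simple way such analysis can point at candidate assembly errors.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every length-k substring across the given sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def kmer_spectrum(counts):
    """Map each multiplicity to the number of distinct k-mers seen that often."""
    return Counter(counts.values())


def unsupported_kmers(contig, read_counts, k, min_count=2):
    """Return (position, k-mer) pairs in a contig that are rare in the reads.

    min_count is an illustrative threshold (an assumption), not a value
    taken from the paper.
    """
    contig = contig.upper()
    return [
        (i, contig[i:i + k])
        for i in range(len(contig) - k + 1)
        if read_counts[contig[i:i + k]] < min_count
    ]


if __name__ == "__main__":
    # Toy data (hypothetical): three overlapping reads and one contig
    # that contains a single substituted base.
    reads = ["ACGTACGT", "CGTACGTA", "GTACGTAC"]
    counts = count_kmers(reads, k=4)
    print(kmer_spectrum(counts))
    print(unsupported_kmers("ACGTTCGT", counts, k=4))
```

A pure-Python counter like this is only practical for small inputs; real sequencing data would normally be handled by a dedicated k-mer counting tool.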