Abstract: Reconstruction of complete bacterial genomes is a vital aspect of microbial research, as it provides comprehensive information about genetic content, gene ontology, and regulation. It has become the domain of third-generation, long-read sequencing platforms, as short-read technologies deliver mainly fragmented genomes. The PacBio platform can provide high-quality complete genomes, yet remains one of the most expensive sequencing strategies. Oxford Nanopore Technologies (ONT) offers the advantage of producing the longest reads while also being the most cost-effective option in terms of platform, library preparation, and sequencing costs. However, ONT's error rate, although significantly reduced lately, is still met with a degree of distrust in the scientific community. In recent years, hybrid assembly of Nanopore and Illumina data has been used to compensate for ONT's error rate and has yielded the best results in terms of genome completeness, quality, and price. However, the latest advancements in Nanopore technology, including new flow cells (R10.4.1), new library preparation chemistry (V14) and duplex mode, updated basecallers (Dorado v0.4.1), and the realization that sequencing in dark mode results in significantly increased throughput, have had a significant impact on the quality of generated data and, thus, on the recovery of complete genomes by ONT sequencing alone. In this study, we compared the data generated by ONT using three sequencing strategies (Native barcoding, Rapid barcoding, and the custom-developed BARSEQ) against PacBio and Illumina (NextSeq) data as well as Illumina-ONT hybrid data. For this purpose, we employed three strains of the actinobacteria, whose genomes have proven difficult to reconstruct due to high GC content, regions of repeated sequences, and massive genome rearrangements.
Our data indicate that DNA libraries prepared with the Native barcoding kit, sequenced with V14 chemistry on an R10.4.1 flow cell, and assembled with Flye yielded complete genomes whose overall quality was highly similar to that of genomes reconstructed with PacBio. The highest quality was achieved by hybrid assembly of data from the Native barcoding kit complemented with data from the custom-developed BARSEQ, both sequenced on an R10.4.1 flow cell. In conclusion, our results demonstrate that ONT can be used as a cost-effective sequencing strategy for the reconstruction of complete genomes of the highest quality, without the need to complement it with other sequencing technologies.
