Accurate genotyping of three major respiratory bacterial pathogens with ONT R10.4.1 long-read sequencing

Nora Zidane, Carla Rodrigues, Valerie Bouchez, Marthin Rethoret-Pasty, Virginie Passet, Sylvain Brisse, Chiara Crestani
DOI: https://doi.org/10.1101/2024.10.03.616467
2024-10-03
Abstract: High-throughput massively parallel sequencing has significantly improved bacterial pathogen genomics, diagnostics, and epidemiology. Despite its high accuracy, short-read sequencing struggles with complete genome reconstruction and with the assembly of extrachromosomal elements such as plasmids. Long-read sequencing with Oxford Nanopore Technologies (ONT) presents an alternative that offers benefits such as real-time sequencing and cost-efficiency, which are particularly useful in resource-limited settings. However, the higher error rates of ONT have so far limited its application in high-precision genomic typing. The recent release of ONT's R10.4.1 chemistry, with significantly improved raw read accuracy (Q20+), offers a potential solution to this problem. The aim of this study was to evaluate the performance of ONT's latest chemistry for bacterial genomic typing against the gold standard Illumina technology, focusing on three respiratory pathogens of public health importance, Klebsiella pneumoniae, Bordetella pertussis, and Corynebacterium diphtheriae, and their related species. Using the Rapid Barcoding Kit V14, we generated and analyzed genome assemblies with different basecalling tools and models, at different simulated depths of coverage. ONT assemblies were compared to the Illumina reference for completeness and for core genome multilocus sequence typing (cgMLST) accuracy, measured as the number of allelic mismatches. Our results show that accuracy was optimized for all bacterial species tested when raw data were basecalled with Dorado SUP v0.7.1 (using both simplex and duplex reads), assembled with Flye, and sequenced to a minimum coverage depth of 30X. The error rates were consistently below 1% of each cgMLST scheme, indicating that ONT R10.4.1 data are suitable for high-resolution genomic typing applied to outbreak investigations and public health surveillance.
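The accuracy metric named in the abstract, allelic mismatches between an ONT assembly and its Illumina reference expressed as a fraction of the cgMLST scheme, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' pipeline: the profile representation (a mapping from locus name to allele identifier), the function names, the choice to skip loci left uncalled in either profile, and the toy data.

```python
from typing import Dict, Optional

# Hypothetical representation: locus name -> allele identifier (None = not called).
Profile = Dict[str, Optional[str]]


def allelic_mismatches(ont: Profile, illumina: Profile) -> int:
    """Count loci called in both profiles whose allele identifiers differ."""
    mismatches = 0
    for locus, ref_allele in illumina.items():
        ont_allele = ont.get(locus)
        if ref_allele is None or ont_allele is None:
            # Assumption: an uncalled locus is not counted as a mismatch here.
            continue
        if ont_allele != ref_allele:
            mismatches += 1
    return mismatches


def mismatch_rate(ont: Profile, illumina: Profile, scheme_size: int) -> float:
    """Express the mismatch count as a fraction of the loci in the scheme."""
    if scheme_size <= 0:
        raise ValueError("scheme_size must be a positive number of loci")
    return allelic_mismatches(ont, illumina) / scheme_size


# Toy data (invented): a four-locus scheme with one discordant allele call.
illumina_profile = {"locus_1": "1", "locus_2": "7", "locus_3": "3", "locus_4": None}
ont_profile = {"locus_1": "1", "locus_2": "8", "locus_3": "3", "locus_4": "2"}

n_mismatches = allelic_mismatches(ont_profile, illumina_profile)
rate = mismatch_rate(ont_profile, illumina_profile, scheme_size=4)
print(n_mismatches, rate)
```

In this sketch the rate is divided by the full scheme size rather than by the number of loci called in both profiles; which denominator the study used is not stated in the abstract.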
Genomics