A workflow for practical training in ecological genomics using Oxford Nanopore long-read sequencing

Robert Foster, Heleen De Weerd, Nathan Medd, Tim Booth, Caitlin Newman, Helen Ritch, Javier Santoyo-Lopez, Urmi Trivedi, Alex D. Twyford
DOI: https://doi.org/10.1101/2024.09.03.610948
2024-09-04
Abstract: Long-read single-molecule sequencing technologies continue to grow in popularity for genome assembly and provide an effective way to resolve large and complex genomic variants. However, uptake of these technologies for teaching and training is hampered by the complexity of high molecular weight DNA extraction protocols, the time required for library preparation, the cost of sequencing, and the challenges of downstream data analysis. Here, we present a full long-read workflow optimised for teaching that covers each stage, from DNA extraction, through library preparation and sequencing, to data QC, genome assembly and characterisation, and that can be completed in under two weeks. We use a specific case study of plant identification, in which students identify an anonymous plant sample by sequencing and assembling its genome and comparing it to other samples and to reference databases. In testing, long-read genome skimming of nine wild-collected plant species, extracted with a modified kit-based approach, produced an average of 8 Gb of Oxford Nanopore data, enabling the complete assembly of plastid genomes and the partial assembly of nuclear genomes. In the classroom, all students were able to complete the protocols and to correctly identify their plant samples based on BOLD searches of barcoding loci extracted from the plastid genome, coupled with phylogenetic analyses of whole plastid genomes. We supply all the learning material and raw data, allowing the workflow to be adapted to a range of teaching settings.
Genomics