Comparative evaluation of soil DNA extraction kits for long read metagenomic sequencing

Harry T Child, Lucy Wierzbicki, Gabrielle R Joslin, Richard K Tennant
DOI: https://doi.org/10.1099/acmi.0.000868.v3
2024-09-27
Abstract: Metagenomics has been transformative in our understanding of the diversity and function of soil microbial communities. Applying long-read sequencing to whole genome shotgun (WGS) metagenomics has the potential to revolutionise soil microbial ecology through improved taxonomic classification, functional characterisation and metagenome assembly. However, robust methods for long-read metagenomics of environmental samples remain underdeveloped. In this study, Oxford Nanopore sequencing of DNA extracted with five commercially available soil DNA extraction kits was compared across four soil types, in order to optimise read length and reproducibility for comparative long-read soil metagenomics. Average extracted DNA lengths varied considerably between kits, but longer DNA fragments did not translate consistently into longer read lengths. Highly variable decreases in the length of resulting reads from some kits were associated with poor classification rates and low reproducibility of the microbial communities identified between technical repeats. Replicate samples from other kits showed more consistent conversion of extracted DNA fragment size into read length and resulted in more congruous microbial community representation. Furthermore, extraction kits showed significant differences in the community representation and structure they identified across all soil types. Overall, the QIAGEN DNeasy PowerSoil Pro Kit displayed the best suitability for reproducible long-read WGS metagenomic sequencing, although further optimisation of DNA purification and library preparation may enable translation of higher molecular weight DNA from other kits into longer read lengths. These findings provide novel insight into the importance of optimising DNA extraction for achieving replicable results from long-read metagenomic sequencing of environmental samples.