Cost-efficient PCR based DNA barcoding of marine invertebrate specimens with NovaSeq amplicon sequencing

Genki Kobayashi, Hirokazu Abe
DOI: https://doi.org/10.1007/s11033-024-09811-z
2024-08-06
Abstract:
Background: The marine environment harbors high biodiversity; however, it is poorly understood. Nucleotide sequence data of all marine organisms should be accumulated before natural and/or anthropogenic environmental changes jeopardize the marine environment. In this study, we report a cost-effective and easy DNA barcoding method that can be readily adopted without library preparation kits. It comprises multiplex PCR of short targets, indexing PCR, and outsourcing to a sequencing service that uses the NovaSeq system.

Methods and results: We targeted four mitochondrial genes [cytochrome c oxidase subunit I (COI), COIII, 16S rRNA (16S), and 12S rRNA (12S)] and three nuclear genes [18S rRNA (18S), 28S rRNA (28S), and internal transcribed spacer 2 (ITS2)] in 95 marine invertebrate specimens, which were primarily annelids. The primers, including adapters and indices for NovaSeq sequencing, were newly designed. Two PCR runs were conducted: the 1st PCR amplified specific loci with universal primers, and the 2nd PCR added sequencing adapters and indices to the 1st PCR products. The gene sequences obtained from the FASTQ files were subjected to BLAST searches and phylogenetic analyses. One run using 95 specimens yielded sequences averaging 2816 bp per specimen for a total length of six loci. Nuclear genes were assembled more successfully than mitochondrial genes. A weak but significant negative correlation was observed between the average length of each locus and the success rate of the assembly. Some of the sequences were almost identical to sequences obtained from specimens collected far from Japan, indicating the presence of potentially invasive species identified for the first time.

Conclusions: We obtained gene sequences efficiently using next-generation sequencing rather than Sanger sequencing. Although this method requires further optimization to increase the success rate for some loci, it can be used as a first step to select specimens for further analyses by determining the specific loci of the targets.