How eDNA data filtration, sequence coverage, and primer selection influence assessment of fish communities in northern temperate lakes

Erik García‐Machado, Eric Normandeau, Guillaume Côté, Louis Bernatchez
DOI: https://doi.org/10.1002/edn3.444
2023-06-29
Environmental DNA
Abstract: We evaluate species' minimum read proportion threshold, sequence coverage, and primer choice to optimize fish biodiversity monitoring in temperate lakes. Our results suggest: first, using a species minimum read proportion threshold of 0.001 to consider a species valid, which limits false positives; second, sequencing depth had a limited effect on eDNA detection; nonetheless, we suggest using higher sequencing depths to increase rare species detectability and improve the relative abundance estimation of such species; third, the 12S MiFish‐U primer set is the best choice to survey fish communities in north‐temperate lakes. For nearly 15 years now, environmental DNA (eDNA) has demonstrated its effectiveness in monitoring biodiversity. Methodological and technical improvements have significantly enhanced the field. However, the effects of factors such as sequence coverage, bioinformatic filtration, and primer choice have been less explored or need to be optimized according to specific survey objectives and study site characteristics. We evaluated these factors to help optimize monitoring fish biodiversity in North American temperate lakes. We sampled water for fish community eDNA analysis in 12 lakes from southwestern Québec, Canada. The lakes were selected to encompass a wide range of surface areas and species richness. We sampled water from a total of 520 sites (25–50 per lake) and analyzed three mitochondrial DNA regions (12S rRNA, 16S rRNA, and cytb) using NovaSeq sequencing. Our results, based on rarefied count matrices (from a sequencing depth of 100,000 to a minimum depth of 1000 reads per sample), showed that keeping a species in a sample only if it represented at least one thousandth of that sample's reads (species minimum read proportion threshold = 0.001) was adequate to remove false positives and had a limited negative impact on true positives with low read counts.
Sequencing depth was found to have a negligible impact on the accuracy of fish community assessment in a given lake. With the same sequencing depth and a complete local reference database for each primer set, a single primer set produced a median species richness similar to that of combinations of two or three primer sets. Overall, 12S and 16S detected more species and provided more consistent community profiles than cytb. Based on our observations, we suggest using the 12S MiFish‐U primer set and applying a minimum proportion of 0.001 reads per species and site to monitor north‐temperate lentic freshwater fish communities.
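
The per-sample minimum read proportion filter described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `filter_by_min_read_proportion`, the dictionary layout, and the read counts are all hypothetical, chosen only to show how a 0.001 threshold would act on a single sample.

```python
def filter_by_min_read_proportion(sample_counts, threshold=0.001):
    """Keep species whose reads are at least `threshold` of the sample's total reads.

    `sample_counts` maps species name -> read count for ONE sample (assumed layout).
    """
    total = sum(sample_counts.values())
    if total == 0:
        # An empty or all-zero sample has no species to keep.
        return {}
    return {
        species: reads
        for species, reads in sample_counts.items()
        if reads / total >= threshold
    }


# Hypothetical read counts for one sample (illustrative only, not study data).
example_sample = {
    "Perca flavescens": 60000,
    "Salvelinus fontinalis": 39845,
    "Esox lucius": 150,
    "Micropterus dolomieu": 5,
}

filtered = filter_by_min_read_proportion(example_sample)
print(sorted(filtered))
```

In this made-up sample of 100,000 reads, a species needs at least 100 reads to be retained, so the species with 5 reads is treated as a likely false positive while the one with 150 reads is kept.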