Species distribution models for invasive Eurasian watermilfoil highlight the importance of data quality and limitations of discrimination accuracy metrics

Shyam M. Thomas, Michael R. Verhoeven, Jake R. Walsh, Daniel J. Larkin, Gretchen J. A. Hansen
DOI: https://doi.org/10.1002/ece3.8002
IF: 3.167
2021-08-13
Ecology and Evolution
Abstract:
Aim: Availability of uniformly collected presence, absence, and abundance data remains a key challenge in species distribution modeling (SDM). For invasive species, abundance and impacts are highly variable across landscapes, and quality occurrence and abundance data are critical for predicting locations at high risk for invasion and impacts, respectively. We leverage a large aquatic vegetation dataset comprising point-level survey data that includes information on the invasive plant Myriophyllum spicatum (Eurasian watermilfoil) to: (a) develop SDMs to predict invasion and impact from environmental variables based on presence–absence, presence-only, and abundance data, and (b) compare evaluation metrics based on functional and discrimination accuracy for presence–absence and presence-only SDMs.
Location: Minnesota, USA.
Methods: Eurasian watermilfoil presence–absence and abundance information were gathered from 468 surveyed lakes, and 801 unsurveyed lakes were leveraged as pseudoabsences for presence-only models. A Random Forest algorithm was used to model the distribution and abundance of Eurasian watermilfoil as a function of lake-specific predictors, both with and without a spatial autocovariate. Occurrence-based SDMs were evaluated using conventional discrimination accuracy metrics and functional accuracy metrics assessing correlation between predicted suitability and observed abundance.
Results: Water temperature degree days and maximum lake depth were two leading predictors influencing both invasion risk and abundance, but they were relatively less important for predicting abundance than other water quality measures. Road density was a strong predictor of Eurasian watermilfoil invasion risk but not abundance. Model evaluations highlighted significant differences: presence–absence models had high functional accuracy despite low discrimination accuracy, whereas presence-only models showed the opposite pattern.
Main conclusion: Complementing presence–absence data with abundance information offers a richer understanding of invasive Eurasian watermilfoil's ecological niche and enables evaluation of the model's functional accuracy. Conventional discrimination accuracy measures were misleading when models were developed using pseudoabsences. We thus caution against the overuse of presence-only models and suggest directing more effort toward systematic monitoring programs that yield high-quality data.
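The two kinds of evaluation contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The abstract names neither the discrimination metric nor the correlation measure, so the sketch assumes AUC for discrimination accuracy and Spearman rank correlation for functional accuracy. All function names and the six-lake toy data are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Discrimination accuracy (assumed here to be AUC): the probability that
    a presence (label 1) receives a higher score than an absence (label 0)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one presence and one absence")
    ranks = average_ranks(scores)
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Mann-Whitney U statistic scaled to the [0, 1] range.
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Functional accuracy (assumed here to be Spearman correlation) between
    predicted suitability and observed abundance."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data for six lakes (not taken from the paper).
suitability = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]  # predicted by some SDM
presence = [0, 0, 1, 0, 1, 1]                 # observed occurrence
abundance = [0.0, 0.0, 0.1, 0.0, 0.4, 0.7]    # observed abundance

discrimination = auc(presence, suitability)
functional = spearman(suitability, abundance)
print(f"AUC = {discrimination:.3f}, Spearman rho = {functional:.3f}")
```

Because the two scores answer different questions (ranking presences above absences versus tracking how abundant the species is), a model can do well on one and poorly on the other, which is the pattern the abstract reports.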
ecology, evolutionary biology