Quality of presence data determines species distribution model performance: a novel index to evaluate data quality

Songlin Fei, Feng Yu
DOI: https://doi.org/10.1007/s10980-015-0272-7
IF: 5.043
2015-09-11
Landscape Ecology
Abstract:
Context: Species distribution models (SDMs) are widely used to estimate species' potential distribution at landscape to regional scales. However, the quality of occurrence data is often compromised by sampling bias, which could raise serious concerns about model accuracy.
Objectives: We propose a model-independent composite measure, the representativeness and completeness (RAC) index, to evaluate the quality of species occurrence data. We demonstrate (1) the impact of spatial data quality, as measured by RAC, on model performance and (2) the feasibility of applying RAC in an actual modeling process.
Methods: Using a set of computational experiments on a virtual species, we calculated RAC values for a set of occurrence data representing different degrees of sampling bias. We evaluated model performance (reliability and accuracy) and related it to the RAC values. Two case studies were also conducted to demonstrate the association between RAC and model performance.
Results: Model reliability stabilizes when RAC reaches a threshold of 0.4. Model accuracy stabilizes when RAC reaches 0.4 or 0.5 for models with or without complete predictors, respectively. Model performance is more sensitive to data completeness than to representativeness. Our case studies further demonstrated that the RAC value is closely related to model performance.
Conclusions: Performance of SDMs is closely related to the quality of species occurrence data, which can be measured by our RAC index. We recommend a minimum RAC value of 0.4 for reliable and accurate SDM predictions. To improve prediction accuracy, sampling with multiple centers in a systematic fashion across the environmental space is desirable.
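The abstract names the two ingredients of the index (representativeness and completeness) but does not give its formula. The Python sketch below is therefore only a toy illustration of how two such ingredients could be computed and combined; it is not the authors' RAC definition. Everything in it is an assumption made for illustration: the function names, the use of discrete environmental bins, total variation distance as the representativeness measure, and the product as the way of combining the two parts.

```python
from collections import Counter


def proportions(bins):
    """Share of records falling in each environmental bin."""
    counts = Counter(bins)
    total = sum(counts.values())
    return {b: n / total for b, n in counts.items()}


def completeness(occupied_bins, sampled_bins):
    """Fraction of the occupied environmental bins that hold at least one sample.

    Assumption: 'completeness' is read here as coverage of environmental space.
    """
    occupied = set(occupied_bins)
    return len(occupied & set(sampled_bins)) / len(occupied)


def representativeness(occupied_bins, sampled_bins):
    """One minus the total variation distance between the sample distribution
    and the true distribution over bins.

    Assumption: this distance is a stand-in, not the measure used in the paper.
    """
    p = proportions(occupied_bins)
    q = proportions(sampled_bins)
    keys = set(p) | set(q)
    tv = 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)
    return 1.0 - tv


def toy_quality_score(occupied_bins, sampled_bins):
    """Hypothetical composite score in [0, 1]: the product of the two parts."""
    if not occupied_bins or not sampled_bins:
        raise ValueError("both inputs must contain at least one record")
    return (completeness(occupied_bins, sampled_bins)
            * representativeness(occupied_bins, sampled_bins))


# Virtual species occupying four environmental bins equally often,
# sampled with a strong bias toward bin "a".
true_bins = ["a", "a", "b", "b", "c", "c", "d", "d"]
biased_sample = ["a", "a", "a", "b"]
score = toy_quality_score(true_bins, biased_sample)
```

In this example the biased sample reaches only two of the four occupied bins and over-weights one of them, so both parts of the score drop. Because the toy score is not the published index, it is not calibrated against the 0.4 threshold reported in the abstract.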
Ecology; Geography, Physical; Geosciences, Multidisciplinary