You shall know a species by the company it keeps: Leveraging co‐occurrence data to improve ecological prediction

Andrew Siefert, Daniel C. Laughlin, Francesco Maria Sabatini
DOI: https://doi.org/10.1111/jvs.13314
IF: 2.8
2024-11-09
Journal of Vegetation Science
Abstract: Making predictions about species, including how they respond to environmental change, is a central challenge for ecologists. Here we show that GloVe, an unsupervised learning algorithm originally designed for language modelling, can be used to encode the information contained in species co‐occurrence data and explain species range shifts more reliably than models including only functional traits or phylogenetic information.
Aim Making predictions about species, including how they respond to environmental change, is a central challenge for ecologists. Because of the huge number of species, ecologists seek generalizations based on species' traits and phylogenetic relationships, but the predictive power of trait‐based and phylogenetic models is often low. Species co‐occurrence patterns may contain additional information about species' ecological attributes not captured by traits or phylogenies. We propose using a novel ordination technique to encode the information contained in species co‐occurrence data in low‐dimensional vectors that can be used to represent species in ecological prediction.
Method We present an efficient method to derive species vectors from co‐occurrence data using Global Vectors for Word Representation (GloVe), an unsupervised learning algorithm originally designed for language modelling. To demonstrate the method, we used GloVe to generate vectors for nearly 40,000 plant species using co‐occurrence statistics derived from sPlotOpen, an open‐access global vegetation plot database, and tested their ability to predict elevational range shifts in European montane plant species.
Results Co‐occurrence‐based species vectors were weakly correlated with traits or phylogeny, indicating that they encode unique information about species. Models including co‐occurrence‐based vectors explained twice as much variation in species range shifts as models including only traits or phylogenetic information.
Conclusions Given the widespread availability of species occurrence data, species vectors learned from co‐occurrence patterns are a widely applicable and powerful tool for encoding ecological information about species, with many potential applications for describing and predicting the ecology of species, communities and ecosystems.
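As a rough illustration of the approach described under Method, the sketch below fits GloVe-style vectors to a tiny, made-up plot-by-species matrix. It is not the authors' implementation. Several things here are assumptions: co-occurrence is counted as the number of plots two species share (the abstract does not say how the statistics were derived from sPlotOpen), the function names `cooccurrence_counts` and `fit_glove` and the toy data are hypothetical, and plain full-batch gradient descent stands in for whatever optimizer was actually used. The weighting function and its defaults (`x_max=100`, `alpha=0.75`) follow the original GloVe formulation.

```python
import numpy as np

def cooccurrence_counts(plots):
    """Species-by-species co-occurrence counts from a 0/1 plot-by-species matrix.

    Assumption: two species co-occur once for every plot that contains both.
    """
    plots = np.asarray(plots, dtype=float)
    counts = plots.T @ plots        # entry (i, j) = number of shared plots
    np.fill_diagonal(counts, 0.0)   # ignore a species co-occurring with itself
    return counts

def fit_glove(counts, dim=2, x_max=100.0, alpha=0.75, lr=0.05, epochs=200, seed=0):
    """Minimise the GloVe weighted least-squares loss by full-batch gradient descent.

    Loss: sum over pairs of f(X_ij) * (w_i . c_j + b_i + b'_j - log X_ij)^2,
    with f(x) = min(x / x_max, 1) ** alpha, so pairs with X_ij = 0 get zero weight.
    """
    rng = np.random.default_rng(seed)
    n = counts.shape[0]
    W = rng.normal(scale=0.1, size=(n, dim))   # "focal" species vectors
    C = rng.normal(scale=0.1, size=(n, dim))   # "context" species vectors
    bw = np.zeros(n)
    bc = np.zeros(n)

    mask = counts > 0
    log_x = np.log(np.where(mask, counts, 1.0))          # 0 where the count is 0
    weight = np.minimum(counts / x_max, 1.0) ** alpha    # 0 where the count is 0

    losses = []
    for _ in range(epochs):
        err = W @ C.T + bw[:, None] + bc[None, :] - log_x
        werr = weight * err
        losses.append(float(np.sum(werr * err)))
        # Compute all gradients from the current parameters before updating.
        grad_W = 2.0 * werr @ C
        grad_C = 2.0 * werr.T @ W
        grad_bw = 2.0 * werr.sum(axis=1)
        grad_bc = 2.0 * werr.sum(axis=0)
        W -= lr * grad_W
        C -= lr * grad_C
        bw -= lr * grad_bw
        bc -= lr * grad_bc

    # Summing the two vector sets is the convention from the GloVe paper;
    # whether the study did the same is not stated in the abstract.
    return W + C, losses

# Toy data (invented): 6 plots (rows) by 4 species (columns), 1 = present.
plots = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
])
counts = cooccurrence_counts(plots)
vectors, losses = fit_glove(counts)
```

Each row of `vectors` is a low-dimensional representation of one species. In a workflow like the one the abstract describes, such rows would presumably serve as predictors in a downstream model, alongside or instead of traits and phylogenetic information.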
ecology, plant sciences, forestry