Multi-Scale and Multimodal Species Distribution Modeling

Nina van Tiel, Robin Zbinden, Emanuele Dalsasso, Benjamin Kellenberger, Loïc Pellissier, Devis Tuia
2024-11-06
Abstract: Species distribution models (SDMs) aim to predict the distribution of species by relating occurrence data with environmental variables. Recent applications of deep learning to SDMs have enabled new avenues, specifically the inclusion of spatial data (environmental rasters, satellite images) as model predictors, allowing the model to consider the spatial context around each species' observations. However, the appropriate spatial extent of the images is not straightforward to determine and may affect the performance of the model, as scale is recognized as an important factor in SDMs. We develop a modular structure for SDMs that allows us to test the effect of scale in both single- and multi-scale settings. Furthermore, our model enables different scales to be considered for different modalities, using a late fusion approach. Results on the GeoLifeCLEF 2023 benchmark indicate that considering multimodal data and learning multi-scale representations leads to more accurate models.
Machine Learning
What problem does this paper attempt to address?
The core problem this paper addresses is how to effectively integrate multi-scale and multimodal geospatial data in species distribution models (SDMs) to improve model performance. Specifically, the authors focus on the following aspects:

1. **Influence of spatial scale**:
   - Study how the spatial scale (i.e., the size of the image patches) affects model performance.
   - Examine whether the most suitable patch size varies with species or environment type.
2. **Fusion of multimodal data**:
   - Combine different data sources (such as bioclimatic variables and satellite images), accounting for their different resolutions and scales through a late fusion approach.
   - Evaluate whether multimodal data improves the accuracy of the model.
3. **Application of deep learning**:
   - Use deep learning, in particular convolutional neural networks (CNNs), to process and analyze geospatial data.
   - Compare the performance of traditional SDM methods with that of deep-learning-based methods.

### Main research content

- **Unimodal model**: Use bioclimatic variables and satellite images separately as inputs to study model performance at different scales.
- **Multimodal model**: Combine bioclimatic variables and satellite images to explore the influence of multi-scale feature extraction on model performance.
- **Experimental verification**: Run experiments on the GeoLifeCLEF 2023 benchmark dataset to evaluate how well the different models predict species.

### Key findings

- **Small scale is more suitable for bioclimatic variables**: For bioclimatic variables, a smaller spatial scale (such as 1×1 pixel) usually performs best.
- **Multi-scale improves the performance of satellite image models**: For satellite images, multi-scale models significantly improve prediction performance.
- **Multimodal models perform best**: Multimodal models that combine bioclimatic variables and satellite images show the best performance in both species-level and site-level evaluations.

Through these studies, the authors demonstrate the importance of multi-scale and multimodal data in species distribution modeling and provide a useful reference for future related research.
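To make the combination of multi-scale extraction and late fusion more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that each scale is obtained by center-cropping a patch around the observation, and it replaces the paper's CNN encoders with a simple per-channel mean so that the example stays self-contained. The function names, the channel counts (19 bioclimatic layers, 4 satellite bands), and the patch sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def center_crop(patch: np.ndarray, size: int) -> np.ndarray:
    """Crop a (channels, height, width) patch to size x size around its center."""
    _, height, width = patch.shape
    if size > min(height, width):
        raise ValueError("crop size exceeds patch size")
    top = (height - size) // 2
    left = (width - size) // 2
    return patch[:, top:top + size, left:left + size]


def multi_scale_features(patch: np.ndarray, scales: list[int]) -> np.ndarray:
    """Stand-in encoder (assumption): per-channel mean of each center crop.

    A real model would apply a learned encoder (e.g. a CNN) to each crop;
    here the features of all scales are simply concatenated.
    """
    per_scale = [center_crop(patch, s).mean(axis=(1, 2)) for s in scales]
    return np.concatenate(per_scale)


def late_fusion(bioclim_patch, satellite_patch, bioclim_scales, satellite_scales):
    """Encode each modality separately, then concatenate the feature vectors.

    Each modality may use its own set of scales, which is the point of
    fusing late rather than stacking the inputs.
    """
    bio = multi_scale_features(bioclim_patch, bioclim_scales)
    sat = multi_scale_features(satellite_patch, satellite_scales)
    return np.concatenate([bio, sat])


# Hypothetical inputs: random data standing in for real rasters.
rng = np.random.default_rng(0)
bioclim = rng.normal(size=(19, 9, 9))      # 19 layers, 9x9 pixels (assumed)
satellite = rng.normal(size=(4, 64, 64))   # 4 bands, 64x64 pixels (assumed)

fused = late_fusion(
    bioclim, satellite,
    bioclim_scales=[1],             # single small scale for bioclimatic data
    satellite_scales=[16, 32, 64],  # several scales for satellite imagery
)
print(fused.shape)  # (31,) = 19 * 1 scale + 4 * 3 scales
```

In a full model, the fused vector would feed a classification head that predicts species presence. The scale settings in the usage example mirror the reported findings (a small scale for bioclimatic variables, multiple scales for satellite images) but are otherwise arbitrary.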