Abstract: Small mammal species play important roles in influencing vegetation primary productivity, plant species composition, seed dispersal and soil structure, and as predator and/or prey species. Species with cyclic population dynamics can, during high-population phases, heavily impact agricultural sectors and promote rodent-borne disease transmission. To better understand the drivers behind small mammal distributions and abundances, and how these differ between species, it is necessary to characterise the landscape variables important for the life cycles of the species in question. In this study, a suite of Earth observation-derived metrics quantifying landscape characteristics and dynamics is combined with in-situ small mammal trapline and transect survey data to generate random forest species distribution models for nine small mammal species at study sites in Narati, China and Sary Mogul, Kyrgyzstan. These species distribution models identify the landscape proxy variables that drive species abundances and distributions, and in turn the optimal conditions for each species. The observed relationships differed between species, with the number of landscape proxy variables identified as important for each species ranging from 3 for Microtus gregalis at Sary Mogul to 26 for Ellobius tancrei at Narati. Results indicate that grasslands were predicted to hold higher abundances of Microtus obscurus, E. tancrei and Marmota baibacina, and forest areas higher abundances of Myodes centralis and Sorex asper, while mixed forest-grassland boundary areas and areas close to watercourses were predicted to hold higher abundances of Apodemus uralensis and Sicista tianshanica. Localised variability in vegetation and wetness conditions, as well as the presence of certain habitat types, is also shown to influence the abundances of these small mammal species.
Predictive application of the random forest (RF) models identified spatial hot-spots of high abundance, with model validation producing R² values ranging from 0.670 for M. gregalis transect data at Sary Mogul to 0.939 for E. tancrei transect data at Narati. This enhances previous work in which optimal habitat was defined simply as the presence of a given land cover type. Here, optimal habitat is instead defined via a combination of important landscape dynamic variables, moving from a human-defined to a species-defined perspective of optimal habitat. The species distribution models demonstrate differing distributions and abundances of host species across the study areas, utilising the strengths of Earth observation data to improve our understanding of landscape and ecological linkages to small mammal distributions and abundances.

Impacts of Sample Ratio and Size on the Performance of Random Forest Model to Predict the Potential Distribution of Snail Habitats

The predictive performances of random forest models with limited sample size and different species traits

Effects of sample size on the performance of species distribution models

Deciphering ecology from statistical artefacts: Competing influence of sample size, prevalence and habitat specialization on species distribution models and how small evaluation datasets can inflate metrics of performance

Effects of sample size on accuracy of species distribution models

The effect of sample size and species characteristics on performance of different species distribution modeling methods

Why choose Random Forest to predict rare species distribution with few samples in large undersampled areas? Three Asian crane species models provide supporting evidence

Study on method of determination of appropriate sample size of Oncomelania hupensis in marshland and lake regions

Quantifying Effects of Habitat Heterogeneity and Other Clustering Processes on Spatial Distributions of Tree Species

Mapping small mammal optimal habitats using satellite-derived proxy variables and species distribution models

Minimum habitat size required to detect new rare species

Effect of sample number and location on accuracy of land use regression model in NO2 prediction

Precision mapping of snail habitat in lake and marshland areas: Integrating environmental and textural indicators using Random Forest modeling

Evaluating trade-offs in spatial versus temporal replication when estimating avian community composition and predicting species distributions

The interplay of various sources of noise on reliability of species distribution models hinges on ecological specialisation

No optimal spatial filtering distance for mitigating sampling bias in ecological niche models

[Preliminary Study on Applying High Resolution CBERS Images to Identify Oncomelania Snail Habitats in Lake and Marshland Regions].

Quality of presence data determines species distribution model performance: a novel index to evaluate data quality

Comparative study of sampling strategies for machine learning-based landslide susceptibility assessment

Prospective sampling based on model ensembles improves the detection of rare species

Environmental filtering improves ecological niche models across multiple scales