Website visits can predict angler presence using machine learning

Julia S. Schmid, Sean Simmons, Mark A. Lewis, Mark S. Poesch, Pouria Ramazi
2024-09-26
Abstract: Understanding and predicting recreational fishing activity is important for sustainable fisheries management. However, traditional methods of measuring fishing pressure, such as surveys, can be costly and limited in both time and spatial extent. Predictive models that relate fishing activity to environmental or economic factors typically rely on historical data, which often restricts their spatial applicability due to data scarcity. In this study, high-resolution angler-generated data from an online platform and easily accessible auxiliary data were tested to predict daily boat presence and aerial counts of boats at almost 200 lakes over five years in Ontario, Canada. Lake-information website visits alone enabled predicting daily angler boat presence with 78% accuracy. While incorporating additional environmental, socio-ecological, weather and angler-generated features into machine learning models did not remarkably improve prediction performance of boat presence, they were substantial for the prediction of boat counts. Models achieved an R² of up to 0.77 at known lakes included in the model training, but they performed poorly for unknown lakes (R² = 0.21). The results demonstrate the value of integrating angler-generated data from online platforms into predictive models and highlight the potential of machine learning models to enhance fisheries management.
Physics and Society, Machine Learning
What problem does this paper attempt to address?
The main problem this paper addresses is predicting recreational fishing activity with machine-learning methods, in particular predicting the presence and number of fishing boats from website visit data. Specifically, the research aims to:

1. **Evaluate how well machine-learning models based on angler-generated data (such as website traffic) predict the daily presence of fishing boats at known lakes.** Using only the feature "website traffic in the past seven days", the model achieves a prediction accuracy of 78%.
2. **Explore whether adding environmental, socio-ecological and weather data significantly improves prediction performance.** The additional data matter substantially for predicting the number of boats, but they do not significantly improve the prediction of boat presence.
3. **Verify whether these models can predict fishing activity at unknown lakes.** Model performance at unknown lakes is poor; for the prediction of boat counts, R² is only 0.21.

### Research Background

Traditional methods of measuring fishing pressure (such as on-site surveys and aerial counts) are costly and have limited temporal and spatial coverage. Prediction models usually rely on historical data, which limits their application to other regions. The researchers therefore train machine-learning models on real-time data generated by online platforms, together with easily accessible auxiliary data, to improve the accuracy and practicality of prediction.

### Main Findings

- **Website traffic** is the most important feature for predicting the presence of fishing boats at known lakes; its importance is about two to three times that of other features.
- **Length of the lake shoreline** and **distance from urban areas** are also important predictive features.
- **Random forest** and **gradient-boosted regression trees** are the best-performing machine-learning models, and they handle complex, nonlinear relationships well.
- For unknown lakes, model performance is poor, especially when predicting the number of boats.

### Significance

This study shows that combining angler-generated data with machine-learning methods can improve the prediction of recreational fishing activity to a certain extent, which supports the sustainable management of fishery resources. However, model performance at unknown lakes still needs to be improved. Future research could add more features, such as water quality and fish population size, to further improve the accuracy and applicability of prediction.
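
The difference between "known" and "unknown" lakes comes down to how the test data are held out. The following is a minimal sketch of that distinction, not the authors' pipeline: the records are hypothetical toy values, the helper names (`split_known`, `split_unknown`, `fit_lake_means`) are invented for illustration, and a per-lake mean stands in for a real model such as a random forest. Only the R² definition, 1 - SS_res / SS_tot, is standard.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (lake_id, boat_count). Not data from the study.
RECORDS = [
    ("A", 2), ("A", 3), ("A", 2), ("A", 3),
    ("B", 10), ("B", 12), ("B", 11), ("B", 9),
    ("C", 5), ("C", 6), ("C", 5), ("C", 4),
]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def split_known(records):
    """Hold out every second record, so each lake is in both train and test."""
    return records[0::2], records[1::2]


def split_unknown(records, held_out_lakes):
    """Hold out whole lakes, so test lakes never appear in training."""
    train = [r for r in records if r[0] not in held_out_lakes]
    test = [r for r in records if r[0] in held_out_lakes]
    return train, test


def fit_lake_means(train):
    """Toy stand-in for a model: per-lake mean count plus a global fallback."""
    by_lake = defaultdict(list)
    for lake, count in train:
        by_lake[lake].append(count)
    lake_means = {lake: mean(counts) for lake, counts in by_lake.items()}
    global_mean = mean(count for _, count in train)
    return lake_means, global_mean


def predict(model, test):
    """Use the lake's own mean if it was seen in training, else the global mean."""
    lake_means, global_mean = model
    return [lake_means.get(lake, global_mean) for lake, _ in test]


def evaluate(train, test):
    """Fit on train, predict on test, and score with R^2."""
    preds = predict(fit_lake_means(train), test)
    return r_squared([count for _, count in test], preds)


known_r2 = evaluate(*split_known(RECORDS))
unknown_r2 = evaluate(*split_unknown(RECORDS, {"B", "C"}))
print(f"known lakes R^2:   {known_r2:.2f}")
print(f"unknown lakes R^2: {unknown_r2:.2f}")
```

On this toy data the known-lake score is high and the unknown-lake score is negative, because the stand-in model has no lake-specific information for lakes it never saw. The gap reported in the paper (0.77 versus 0.21) has the same direction, though the toy numbers say nothing about its size.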