Prediction of chlorophyll a and risk assessment of water blooms in Poyang Lake based on a machine learning method

Huadong Huang, Jing Zhang
DOI: https://doi.org/10.1016/j.envpol.2024.123501
IF: 8.9
2024-03-16
Environmental Pollution
Abstract: Four different methods were used to identify the important factors influencing chlorophyll-a (Chl-a) content: correlation analysis (CC-NMI), principal component analysis (PCA), decision tree (DT), and random forest recursive feature elimination (RF-RFE). Considering the relationship between Chl-a and its active and passive factors, we established machine learning combination models based on multiple linear regression (MLR), multi-layer perceptron (MLP), and support vector regression (SVR) to predict Chl-a content for Poyang Lake, China. The predictive performance of the different combination models was then compared and evaluated from multiple perspectives. Considering the actual needs of eutrophication prevention and control, the concept of risk probability was introduced to assess the degree of risk associated with water blooms in Poyang Lake. The results indicated that the mean R² values for the Chl-a predictions using the MLR, MLP, and SVR models were 0.21, 0.61, and 0.75, respectively. The SVR model therefore gave the most accurate predictions of the three. Compared to the other feature selection methods, integrating the SVR model with the RF-RFE method significantly improved the prediction accuracy, with the R² increasing to 0.94. For Poyang Lake, 8.8% of random samples indicated a low risk level with a water bloom probability of 21.1%–36.5%; one sample indicated a medium risk level with a risk probability of 45.5%. The research results offer valuable insights for predicting eutrophication and conducting risk assessments for Poyang Lake. They also provide reliable scientific support for making decisions about eutrophication in lakes and reservoirs. Therefore, the results hold significant theoretical importance, practical value, and potential for widespread application.
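The best-performing combination reported in the abstract, RF-RFE feature selection followed by SVR, can be illustrated with a minimal scikit-learn sketch. Everything specific here is an assumption made for illustration: the data are synthetic (generated with `make_regression`, not Poyang Lake measurements), and the number of retained features, the kernel, and the value of `C` are hypothetical choices, since the abstract does not state the authors' predictors or hyperparameters. The R² printed by this sketch therefore has no relation to the values reported in the paper.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): 10 candidate predictors, 4 informative.
# In a real study these columns would be measured water-quality variables.
X, y = make_regression(n_samples=300, n_features=10, n_informative=4,
                       noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# RF-RFE: repeatedly fit a random forest and drop the least important
# feature until the requested number remains (4 is a hypothetical choice).
selector = RFE(RandomForestRegressor(n_estimators=100, random_state=0),
               n_features_to_select=4, step=1)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# SVR on the selected features; inputs are standardized because the RBF
# kernel is sensitive to feature scale. C=100 is an illustrative value.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0))
model.fit(X_train_sel, y_train)

# R^2 on held-out data, the metric quoted in the abstract.
r2 = r2_score(y_test, model.predict(X_test_sel))
print("selected feature indices:", np.flatnonzero(selector.support_))
print("held-out R^2:", round(r2, 3))
```

Fitting the selector on the training split only, as done here, keeps the held-out R² from being inflated by information leaking from the test samples into the feature selection step.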
environmental sciences