Using Ensemble Learning for Remote Sensing Inversion of Water Quality Parameters in Poyang Lake

Changchun Peng, Zhijun Xie, Xing Jin
DOI: https://doi.org/10.3390/su16083355
IF: 3.9
2024-04-17
Sustainability
Abstract: Inland bodies of water, such as lakes, play a crucial role in sustaining life and supporting ecosystems. However, with rapid socio-economic development, water resources are facing serious pollution problems, such as the eutrophication of water bodies and the degradation of wetlands. Therefore, the monitoring, management, and protection of inland water resources are particularly important. In past research, empirical models and machine learning models have been widely used for the water quality assessment of inland lakes. Due to the complexity of the optical properties of inland lake water bodies, the performance of these models is often limited. To overcome these limitations, this study uses in situ water quality data from 2017 to 2018 and multispectral (MS) remote sensing data from Sentinel-2 to construct experimental samples of Poyang Lake. Based on these experimental samples, we constructed a spatio-temporal ensemble model (STE) to evaluate four common water quality parameters: chlorophyll-a (Chl-a), total phosphorus (TP), total nitrogen (TN), and chemical oxygen demand (COD). The model adopts an ensemble learning strategy, improving performance by merging multiple advanced machine learning algorithms. We introduced several indices related to water quality parameters as auxiliary variables, such as NDCI and Enhanced Three, and used band data and these auxiliary variables as predictive variables, thereby greatly enhancing the predictive potential of the model. The results show that the inversion accuracy of the four inversion models is high (R² of 0.94, 0.88, 0.92, and 0.93; RMSE of 1.15, 0.01, 0.02, and 0.02; MAE of 0.81, 0.01, 0.09, and 0.10), indicating that the STE model has good evaluation accuracy. Meanwhile, we used the STE model to reveal the spatio-temporal distribution of Chl-a, TP, TN, and COD from 2017 to 2018, and analyzed their seasonal and spatial variation patterns. The results of this study not only provide an effective and practical method for monitoring and managing water quality parameters in inland lakes, but also help safeguard water security for socio-economic development and the ecological environment.
environmental sciences, environmental studies, green & sustainable science & technology
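
The abstract names NDCI as one of the auxiliary predictors added alongside the raw band data. As a minimal sketch, NDCI is commonly computed as a normalized difference of red-edge and red reflectance; the mapping to Sentinel-2 bands B5 (about 705 nm) and B4 (about 665 nm), the helper names `ndci` and `build_features`, and the sample reflectances are assumptions for illustration and are not taken from the paper.

```python
def ndci(red_edge, red):
    """Normalized Difference Chlorophyll Index for a single pixel.

    red_edge: reflectance near 705 nm (assumed here to be Sentinel-2 B5)
    red:      reflectance near 665 nm (assumed here to be Sentinel-2 B4)
    """
    denom = red_edge + red
    if denom == 0:
        # Avoid division by zero on masked or empty pixels.
        return float("nan")
    return (red_edge - red) / denom


def build_features(bands):
    """Return a copy of the band dict with NDCI appended as an extra predictor.

    `bands` is a hypothetical mapping from band name to reflectance.
    """
    features = dict(bands)
    features["NDCI"] = ndci(bands["B5"], bands["B4"])
    return features


# Made-up reflectances for one water pixel (illustration only).
pixel = {"B3": 0.045, "B4": 0.030, "B5": 0.050}
print(build_features(pixel))
```

The band values and the derived index would then be passed together as the predictor vector for each sample.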
What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper addresses the difficulty of retrieving water quality parameters in Poyang Lake from remote sensing data. Specifically, the authors focus on how ensemble learning can improve the accuracy of monitoring and evaluating water quality parameters in inland lakes. The paper's objectives are as follows:

1. **Propose a spatio-temporal ensemble model (STE) based on multiple machine learning methods**:
   - By combining advanced machine learning algorithms (such as XGBoost, LightGBM, and CatBoost), an ensemble learning model is constructed to enhance robustness.
   - The model can handle the complex optical properties of inland lake waters and provide more accurate predictions at different temporal and spatial scales.
2. **Utilize Sentinel-2 image data with high spatio-temporal resolution**:
   - By combining in situ water quality data from Poyang Lake with Sentinel-2 imagery, the spatio-temporal distribution of four common water quality parameters (chlorophyll-a (Chl-a), total phosphorus (TP), total nitrogen (TN), and chemical oxygen demand (COD)) is mapped for 2017 to 2018.
   - The monthly, seasonal, and spatial variation of these parameters is analyzed to provide a scientific basis for water quality monitoring in Poyang Lake.
3. **Verify the feasibility and advantages of the STE model in multi-spatio-temporal scenarios**:
   - By comparison with traditional single models, demonstrate the advantages of ensemble learning in optically complex water bodies.
   - Provide references and guidance for water quality management and ecological protection in Poyang Lake.

### Background and significance

With rapid socio-economic development, water resources are facing serious pollution problems, such as eutrophication of water bodies and wetland degradation. Therefore, the monitoring, management, and protection of inland water resources have become particularly important. Traditional water quality monitoring methods are time-consuming and costly, and they make dynamic monitoring difficult. Remote sensing has become an effective means of monitoring water quality parameters because it offers broad spatial coverage and frequent revisits. However, the optical properties of inland lake waters are very complex, and existing empirical and machine learning models are limited in performance. For this reason, the paper proposes a spatio-temporal ensemble model (STE) based on ensemble learning, aiming to overcome these limitations and improve the inversion accuracy of water quality parameters.

### Main contributions

- **Model innovation**: Proposes an ensemble learning model that combines multiple machine learning algorithms, improving robustness and prediction accuracy.
- **Data fusion**: Uses Sentinel-2 image data with high spatio-temporal resolution together with in situ water quality data to construct detailed spatio-temporal distribution maps of water quality parameters.
- **Application prospects**: Provides new methods and technical support for water quality monitoring and management in Poyang Lake and similar water bodies.

Through this work, the authors hope to provide a scientific basis for water quality control and improvement in Poyang Lake, maintain the aquatic ecological balance, and promote the safe development of the social economy and ecological environment.
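
As a rough illustration of the ensemble idea described above, the following Python sketch blends predictions from several base learners by weighted averaging and scores the result with RMSE, MAE, and R², the metrics reported in the abstract. The source does not state how STE combines its base learners, so the averaging rule, the function names, and every number below are assumptions for illustration only; the two prediction lists merely stand in for outputs of models such as XGBoost or LightGBM.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def average_ensemble(predictions, weights=None):
    """Blend per-model prediction lists by weighted averaging.

    This is an assumed combination rule, not the paper's documented method.
    """
    n_models = len(predictions)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    total = sum(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]


# Made-up measurements and base-model outputs (illustration only).
y_true = [2.0, 4.0, 6.0, 8.0]
model_a = [2.6, 3.6, 6.5, 7.7]
model_b = [1.8, 4.6, 5.9, 8.5]
blended = average_ensemble([model_a, model_b])

for name, pred in [("model_a", model_a), ("model_b", model_b), ("blended", blended)]:
    print(name, round(rmse(y_true, pred), 3), round(mae(y_true, pred), 3), round(r2(y_true, pred), 3))
```

In this toy case the base models err in partly opposite directions, so the blend has a lower error than either one alone; with real data the gain depends on how diverse and how accurate the base learners are.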