Prediction of Soil Organic Carbon Content in Complex Vegetation Areas Based on CNN-LSTM Model

Zhaowei Dong, Liping Yao, Yilin Bao, Jiahua Zhang, Fengmei Yao, Linyan Bai, Peixin Zheng
DOI: https://doi.org/10.3390/land13070915
IF: 3.905
2024-06-24
Land
Abstract: Synthesizing bare soil images in regions with complex vegetation is challenging, which limits the accuracy of soil organic carbon (SOC) prediction in such areas. In this study, an SOC prediction model was developed by integrating convolutional neural network and long short-term memory network (CNN-LSTM) algorithms, taking into account soil-forming factors such as climate, vegetation, and topography in Hainan. Compared with common algorithmic models (random forest, CNN, LSTM), the SOC prediction model based on the CNN-LSTM algorithm achieved high accuracy (R² = 0.69, RMSE = 6.06 g kg−1, RPIQ = 1.96). The predicted SOC content ranged from 5.49 to 36.68 g kg−1, with high values in the central and southern parts of Hainan and low values in the surrounding areas; broadly, SOC was high in mountainous areas and low in flat areas. Among the four models, CNN-LSTM outperformed the LSTM, CNN, and random forest models in terms of R² by 11.3%, 23.2%, and 53.3%, respectively. The CNN-LSTM model is therefore applicable to predicting SOC content and shows great potential in complex areas where sample data are difficult to obtain and where SOC is influenced by multiple interacting factors. It also shows significant potential for advancing the broader field of digital soil mapping.
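The three accuracy measures quoted in the abstract can be computed from paired observed and predicted values. The sketch below uses the standard definitions (R² as one minus the ratio of residual to total sum of squares, RMSE as the root of the mean squared error, and RPIQ as the interquartile range of the observations divided by RMSE). The function names, the quartile interpolation method, and the sample values are illustrative assumptions, not taken from the paper.

```python
import math
import statistics


def r2_score(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rmse(obs, pred):
    """Root mean squared error, in the same units as the observations."""
    return math.sqrt(statistics.fmean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def rpiq(obs, pred):
    """Ratio of performance to interquartile distance: (Q3 - Q1) / RMSE."""
    # Assumption: quartiles use the "inclusive" interpolation method;
    # other quartile conventions give slightly different values.
    q1, _, q3 = statistics.quantiles(obs, n=4, method="inclusive")
    return (q3 - q1) / rmse(obs, pred)


# Hypothetical SOC values in g/kg, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 33.0, 37.0, 50.0]

r2 = r2_score(observed, predicted)
err = rmse(observed, predicted)
ratio = rpiq(observed, predicted)
print(f"R2={r2:.3f}  RMSE={err:.3f} g/kg  RPIQ={ratio:.3f}")
```

Because RPIQ scales the error by the spread of the observations, a higher value means the error is small relative to the natural variability of SOC in the samples.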
environmental studies
What problem does this paper attempt to address?
The main problem this paper attempts to address is the challenge of predicting soil organic carbon (SOC) content in areas with complex vegetation. Specifically, due to vegetation cover in the Hainan region, it is difficult to obtain bare soil images that cover the entire study area, which limits the accuracy of SOC predictions and makes digital soil mapping (DSM) more difficult. Additionally, existing models have limited capabilities in handling time-series inputs, which restricts the potential of using multi-source remote sensing information to quantify SOC content.
To address these challenges, the authors developed an SOC prediction model based on Convolutional Neural Networks and Long Short-Term Memory networks (CNN-LSTM), aiming to combine the spatial feature extraction capabilities of CNNs with the time-series processing capabilities of LSTMs to improve the accuracy of SOC content predictions in areas with complex vegetation.
The objectives of this study include:
1. Exploring the potential of the CNN-LSTM model in digital soil mapping in complex areas;
2. Evaluating the performance of this model compared to Random Forest (RF), CNN, and LSTM models;
3. Clarifying the spatial correlation between soil-forming environmental variables and SOC content.
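To make the CNN-LSTM idea concrete, here is a minimal forward-pass sketch in plain Python. It is not the authors' architecture: the layer sizes, the random untrained weights, the 12-step input with three covariates, and the use of a 1D convolution along the time axis (rather than a spatial convolution) are all simplifying assumptions chosen to show how convolutional features can feed an LSTM that outputs a single SOC estimate.

```python
import math
import random


def conv1d_relu(seq, kernels, bias):
    """Valid 1D convolution along time followed by ReLU.
    seq: T vectors of F features; kernels: C kernels of K x F weights."""
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        row = []
        for kern, b in zip(kernels, bias):
            s = b + sum(w * x for j in range(k)
                        for w, x in zip(kern[j], seq[t + j]))
            row.append(max(0.0, s))
        out.append(row)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_last_hidden(seq, W, U, b, hidden):
    """Single-layer LSTM; returns the hidden state after the last step.
    Gates: i (input), f (forget), o (output), g (candidate cell)."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in seq:
        pre = {
            gate: [b[gate][n]
                   + sum(wi * xi for wi, xi in zip(W[gate][n], x))
                   + sum(ui * hi for ui, hi in zip(U[gate][n], h))
                   for n in range(hidden)]
            for gate in "ifog"
        }
        i = [sigmoid(z) for z in pre["i"]]
        f = [sigmoid(z) for z in pre["f"]]
        o = [sigmoid(z) for z in pre["o"]]
        g = [math.tanh(z) for z in pre["g"]]
        c = [f[n] * c[n] + i[n] * g[n] for n in range(hidden)]
        h = [o[n] * math.tanh(c[n]) for n in range(hidden)]
    return h


def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


def build_model(rng, n_features, n_channels=4, kernel=3, hidden=5):
    """Randomly initialised (untrained) weights; all sizes are hypothetical."""
    return {
        "kernels": [rand_matrix(rng, kernel, n_features)
                    for _ in range(n_channels)],
        "conv_bias": [0.0] * n_channels,
        "W": {g: rand_matrix(rng, hidden, n_channels) for g in "ifog"},
        "U": {g: rand_matrix(rng, hidden, hidden) for g in "ifog"},
        "b": {g: [0.0] * hidden for g in "ifog"},
        "out_w": [rng.uniform(-0.1, 0.1) for _ in range(hidden)],
        "out_b": 0.0,
        "hidden": hidden,
    }


def predict(model, seq):
    """Convolutional features -> LSTM -> linear output (one scalar)."""
    feats = conv1d_relu(seq, model["kernels"], model["conv_bias"])
    h = lstm_last_hidden(feats, model["W"], model["U"], model["b"],
                         model["hidden"])
    return model["out_b"] + sum(w * v for w, v in zip(model["out_w"], h))


rng = random.Random(0)
# Hypothetical input: 12 time steps, each with 3 environmental covariates.
sample = [[rng.random() for _ in range(3)] for _ in range(12)]
model = build_model(rng, n_features=3)
conv_out = conv1d_relu(sample, model["kernels"], model["conv_bias"])
y_hat = predict(model, sample)
print(len(conv_out), len(conv_out[0]), y_hat)
```

With untrained weights the output carries no meaning; the point is the data flow. In practice the weights would be fitted to sampled SOC values with a deep learning framework, and the convolution would more likely operate on spatial neighbourhoods of the covariate rasters.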