Spectral Fusion Modeling for Soil Organic Carbon by a Parallel Input-Convolutional Neural Network

Yongsheng Hong, Songchao Chen, Bifeng Hu, Nan Wang, Jie Xue, Zhiqing Zhuo, Yuanyuan Yang, Yiyun Chen, Jie Peng, Yaolin Liu, Abdul Mounem Mouazen, Zhou Shi
DOI: https://doi.org/10.1016/j.geoderma.2023.116584
IF: 6.1
2023-01-01
Geoderma
Abstract: Visible-to-near-infrared (vis-NIR) and mid-infrared (MIR) spectroscopy have been widely utilized for the quantitative estimation of soil organic carbon (SOC). The fusion of vis-NIR and MIR data is hypothesized to provide accurate and reliable prediction of SOC because the spectral range of an individual sensor may lack important absorption features associated with SOC. In this study, six data fusion strategies, namely direct concatenation-partial least squares regression (DC-PLSR), outer product analysis-PLSR (OPA-PLSR), OPA-competitive adaptive reweighted sampling-PLSR (OPA-CARS-PLSR), sequentially orthogonalized-PLSR (SO-PLSR), DC-convolutional neural network (DC-CNN), and parallel input-CNN (PI-CNN), were compared for the spectral estimation of SOC. The data fusion and individual sensor models were developed using soil samples collected from Zhejiang Province, East China, and scanned under laboratory conditions with both vis-NIR and MIR spectrophotometers. The validation results of vis-NIR (validation coefficient of determination [R²] = 0.63-0.73) were generally better than those of MIR (validation R² = 0.45-0.59). For data fusion, the best validation accuracy was achieved by PI-CNN (validation R² = 0.84), followed in descending order by DC-CNN (validation R² = 0.78), SO-PLSR (validation R² = 0.73), OPA-CARS-PLSR (validation R² = 0.69), OPA-PLSR (validation R² = 0.66), and DC-PLSR (validation R² = 0.64). The better performance of PI-CNN over DC-CNN demonstrates the necessity of using different sizes of convolutional kernels before feeding into the fully connected layers of the CNN when fusing vis-NIR and MIR spectral data. The deep-learning fusion method based on PI-CNN can be considered an efficient tool for integrating data from multiple sensors to estimate soil properties in soil spectral modeling.
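The two simplest fusion inputs named in the abstract, and the validation metric it reports, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the toy spectra are hypothetical, the outer product is shown in its usual unfolded form for a single sample, and R² is computed with the common definition 1 - SS_res / SS_tot, which is assumed rather than confirmed by the abstract.

```python
from typing import List, Sequence


def direct_concatenation(vis_nir: Sequence[float], mir: Sequence[float]) -> List[float]:
    """DC fusion input: append the MIR spectrum to the vis-NIR spectrum."""
    return list(vis_nir) + list(mir)


def outer_product_fusion(vis_nir: Sequence[float], mir: Sequence[float]) -> List[float]:
    """OPA fusion input: all pairwise products of the two spectra, unfolded row by row."""
    return [a * b for a in vis_nir for b in mir]


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, assuming the form 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy spectra with hypothetical values (not data from the study).
vis_nir = [0.25, 0.5, 0.75]
mir = [2.0, 4.0]

dc = direct_concatenation(vis_nir, mir)   # length = 3 + 2
opa = outer_product_fusion(vis_nir, mir)  # length = 3 * 2
print(len(dc), len(opa))
```

Direct concatenation grows the input additively with the number of bands, whereas the outer product grows it multiplicatively, which is one reason a variable selection step such as CARS is paired with OPA in the strategies listed above. A parallel-input network would instead keep the two spectra as separate inputs rather than merging them before modeling.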