Mark-Spectra: A Convolutional Neural Network for Quantitative Spectral Analysis Overcoming Spatial Relationships.

Yueting Wang, Minzan Li, Ronghua Ji, Minjuan Wang, Yao Zhang, Lihua Zheng
DOI: https://doi.org/10.1016/j.compag.2021.106624
IF: 8.3
2021-01-01
Computers and Electronics in Agriculture
Abstract: Spectral analysis is one of the most important and widely used methods for chemometrics in the field of agriculture, and convolutional neural network (CNN) models have achieved excellent performance on spectral analysis. The critical drawback of the CNN approach is that it preserves the spatial relationships among adjacent wavelengths, which contribute collinearity and redundancy rather than relevant, effective information. To confirm this observation, this paper visualizes the distribution of characteristic wavelengths extracted by different methods (including F-test, importance weights, and CNN). A convolutional neural network for quantitative spectral analysis, named Mark-Spectra, is presented to overcome spatial relationships and improve model performance. Mark-Spectra introduces a layer (the Mark layer) that overcomes the spatial relationships in raw spectral data. The Mark-Spectra model is compared with three CNN models on three open-access visible and near-infrared spectroscopic datasets (corn, wheat, and soil). Mark-Spectra outperforms the other three CNN models on two datasets (the exception is the wheat dataset, owing to its smaller number of features) and requires much less training time than the others. In addition, this paper compares Mark-Spectra with two classical neural network-based algorithms: principal component analysis with an artificial neural network (PCA-ANN) and the extreme learning machine (ELM). Mark-Spectra performed best on the soil dataset, and ELM performed best on the corn and wheat datasets. These results illustrate that Mark-Spectra is still limited by the characteristics of the raw spectral data (e.g., the number of samples and features), which is a fundamental limitation of deep learning-based methods; nevertheless, it performed better than the other CNN models and reduced the dependence on sample size by overcoming spatial relationships.
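For readers unfamiliar with the ELM baseline named in the abstract, the following is a minimal, generic sketch of an extreme learning machine for regression: a hidden layer with fixed random weights, followed by output weights solved in closed form by least squares. It is not the configuration used in the paper. The function names (`elm_fit`, `elm_predict`), the tanh activation, the hidden-layer size, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def elm_fit(X, y, n_hidden=200, seed=0):
    """Fit a basic ELM regressor (illustrative sketch, not the paper's setup)."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn once at random and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden)) / np.sqrt(X.shape[1])
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)            # hidden-layer activations
    # Output weights: least-squares solution via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Predict with a fitted ELM."""
    return np.tanh(X @ W + b) @ beta

# Synthetic stand-in for spectra: 80 samples, 20 "wavelength" features (assumed).
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 20))
y = 2.0 * X[:, 0] - X[:, 5]           # hypothetical target property

W, b, beta = elm_fit(X, y)
pred = elm_predict(X, W, b, beta)
train_rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
```

Because only the output weights are solved for, training reduces to a single linear least-squares problem, which is one reason ELM is often tried on datasets with few samples.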