Discrimination of cracked soybean seeds by near-infrared spectroscopy and random forest variable selection

Liusan Wang, Ziliang Huang, Rujing Wang
DOI: https://doi.org/10.1016/j.infrared.2021.103731
IF: 2.997
2021-01-01
Infrared Physics & Technology
Abstract: Cracks in soybean seeds reduce seed quality, so it is essential to assess seeds before storage and sowing. Current visual inspection methods for identifying cracked soybean seeds are subjective, inconsistent and slow, while chemical methods are destructive and time-consuming. This study proposes a method for discriminating cracked soybean seeds using near-infrared spectroscopy and random forest variable selection. Spectra of two hundred soybean seeds were acquired using an FT-NIR spectrometer. One hundred and fifty seeds (seventy-five normal and seventy-five cracked) were used for the training and validation sets, and fifty seeds (twenty-five normal and twenty-five naturally cracked) were used for the test set. Principal component analysis (PCA) and random forest (RF) were used to analyze the spectral data from the FT-NIR spectrometer. In addition, three random forest variable selection methods were applied: recursive feature elimination (RFE), Boruta and VarSelRF. The RF model achieved a classification accuracy of 80% on the test set. The main wavenumber variables selected by all three variable selection algorithms were around 7066 and 10,522 cm⁻¹. Among the three algorithms, RFE performed best, achieving an accuracy of 84% on the test set. The selected variables, combined with the results of the RF models, indicate that the major contributors to distinguishing cracked from normal soybean seeds are moisture, amorphous cellulose and fiber content, as reflected in the absorption spectrum. The results demonstrate that the proposed method can be used to detect cracked soybean seeds.
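The pipeline described in the abstract (a random forest classifier combined with recursive feature elimination on a 150/50 train/test split) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes scikit-learn and NumPy, uses synthetic noise in place of real FT-NIR spectra, and the wavenumber grid, band width, effect size, forest size and number of retained variables are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

rng = np.random.default_rng(0)

# Synthetic stand-in for FT-NIR spectra: 200 seeds x 300 wavenumber variables.
# The grid and its range are assumptions, not the instrument's real settings.
wavenumbers = np.linspace(4000, 12000, 300)  # cm^-1
n_per_class = 100
labels = np.repeat([0, 1], n_per_class)      # 0 = normal, 1 = cracked
spectra = rng.normal(0.0, 1.0, size=(2 * n_per_class, wavenumbers.size))

# Inject an artificial class difference near the two bands named in the
# abstract, so that variable selection has something to find.
for centre in (7066, 10522):
    band = np.abs(wavenumbers - centre) < 60
    spectra[labels == 1] += 1.5 * band

# 75 + 75 seeds for training/validation, 25 + 25 for testing, as in the abstract.
train_idx = np.concatenate([np.arange(0, 75), np.arange(100, 175)])
test_idx = np.concatenate([np.arange(75, 100), np.arange(175, 200)])
X_train, y_train = spectra[train_idx], labels[train_idx]
X_test, y_test = spectra[test_idx], labels[test_idx]

# Recursive feature elimination: repeatedly fit the forest and drop the
# least important variables until 10 remain (an arbitrary target here).
forest = RandomForestClassifier(n_estimators=100, random_state=0)
selector = RFE(forest, n_features_to_select=10, step=0.2)
selector.fit(X_train, y_train)

selected = wavenumbers[selector.support_]     # retained wavenumbers, cm^-1
accuracy = selector.score(X_test, y_test)     # test accuracy on reduced spectra

print("Selected wavenumbers (cm^-1):", np.round(selected, 1))
print("Test accuracy:", accuracy)
```

With real data, `spectra` and `labels` would be replaced by the measured absorbance matrix and the normal/cracked annotations. The accuracy printed here reflects only the synthetic signal and says nothing about the 80% and 84% figures reported in the study.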