Comparative Study of Two Different Strategies for Determination of Soluble Solids Content of Apples from Multiple Geographical Regions by Using FT-NIR Spectroscopy.

Guangzhao Tian, Xiaona Li, Baohua Zhang, Jun Zhou, Baoxing Gu
DOI: https://doi.org/10.1109/access.2019.2958841
IF: 3.9
2019-01-01
IEEE Access
Abstract: Apple is one of the most popular fresh fruits and is grown across a wide range of regions, valued for its nutrition and sweet flavor. Fruit composition differs considerably between growing regions because of differences in growing conditions such as temperature and soil nutrients. It is therefore important to reduce the impact of regional variability on the measurement of soluble solids content (SSC) in apples. To reduce this impact and improve the predictive ability of the model, this study compared the performance of two multi-region prediction models for estimating SSC in apples from multiple geographical regions. The first multi-region model was developed by pooling the SSC values and spectral data of all samples from all regions. The second was built by combining a region discriminant model, a model search strategy, and single-region models. A support vector machine (SVM) was used to build the model for discriminating apples from different geographical regions. The region discriminant model achieved a classification accuracy of 99.52%. The two multi-region prediction models were then compared and analyzed to identify the better one. Finally, to remove irrelevant spectral information and reduce the computational cost, the multi-region SSC prediction model was optimized using various spectral preprocessing methods (multiplicative scatter correction (MSC), standard normal variate (SNV), and first derivative (FD) correction) and variable selection methods (Monte Carlo uninformative variable elimination (MC-UVE), competitive adaptive reweighted sampling (CARS), and random frog (RF)).
Overall, the results indicated that SSC in apples from different geographical regions was estimated more accurately by the multi-region model based on the region discriminant model, combined with SNV preprocessing and MC-UVE variable selection, and that its prediction accuracy exceeded that of the single-region models.
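The two strategies compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made only to keep the example self-contained: a nearest-centroid classifier stands in for the SVM region discriminant, ordinary least squares stands in for the calibration model (the abstract does not name the regression method), and the data are synthetic. The names `RegionRoutedModel`, `fit_pooled`, and `make_demo_data` are hypothetical. Only the SNV step follows its standard definition.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def fit_linear(X, y):
    """Least-squares fit with intercept (assumed stand-in for the calibration model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def fit_pooled(X, y):
    """Strategy 1: one model trained on the merged data of all regions."""
    return fit_linear(snv(X), np.asarray(y, dtype=float))

class RegionRoutedModel:
    """Strategy 2: classify the region, then apply that region's own model."""

    def fit(self, X, y, regions):
        Z = snv(X)
        y = np.asarray(y, dtype=float)
        regions = np.asarray(regions)
        self.labels_ = np.unique(regions)
        # Nearest-centroid discriminant (assumption: the paper uses an SVM here).
        self.centroids_ = np.stack([Z[regions == r].mean(axis=0) for r in self.labels_])
        # One single-region calibration model per region.
        self.models_ = {r: fit_linear(Z[regions == r], y[regions == r]) for r in self.labels_}
        return self

    def predict_region(self, X):
        Z = snv(X)
        dist = np.linalg.norm(Z[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.labels_[dist.argmin(axis=1)]

    def predict(self, X):
        Z = snv(X)
        routed = self.predict_region(X)
        out = np.empty(len(Z))
        for r in self.labels_:  # model search: pick the model matching the predicted region
            mask = routed == r
            if mask.any():
                out[mask] = predict_linear(self.models_[r], Z[mask])
        return out

def make_demo_data(n=20):
    """Synthetic two-region spectra; purely illustrative, not measured FT-NIR data."""
    ssc = np.linspace(10.0, 15.0, n)
    ramp = np.arange(1.0, 6.0)
    X_a = np.tile(ramp, (n, 1))
    X_a[:, 0] += 0.01 * ssc          # region A: SSC affects the first band
    X_b = np.tile(ramp[::-1], (n, 1))
    X_b[:, 4] += 0.02 * ssc          # region B: SSC affects the last band
    X = np.vstack([X_a, X_b])
    y = np.concatenate([ssc, ssc])
    regions = np.array(["A"] * n + ["B"] * n)
    return X, y, regions

X, y, regions = make_demo_data()
pooled_pred = predict_linear(fit_pooled(X, y), snv(X))
routed_pred = RegionRoutedModel().fit(X, y, regions).predict(X)
print("pooled RMSE:", np.sqrt(np.mean((pooled_pred - y) ** 2)))
print("routed RMSE:", np.sqrt(np.mean((routed_pred - y) ** 2)))
```

The errors printed here are in-sample and come from toy data, so they say nothing about the paper's results. The sketch only shows where region routing sits relative to the pooled model. Variable selection such as MC-UVE would be applied to the columns of the preprocessed spectra before fitting and is omitted here.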