Classification of Orange Growing Locations Based on the Near-infrared Spectroscopy Using Data Mining.

Songjian Dan, Simon X. Yang, Fengchun Tian, Lie Den
DOI: https://doi.org/10.1080/10798587.2015.1095474
2015-01-01
Intelligent Automation & Soft Computing
Abstract: Classifying growing locations is important for quality control in the orange industry, but it is challenging because oranges have a complex chemical composition and vary widely in taste and size. Traditional classification by human senses is time-consuming and costly. In this paper, a new general classification framework based on near-infrared (NIR) reflection spectroscopy and data mining is proposed. First, the raw NIR spectral data are reduced by principal component analysis (PCA), and then an attribute selection method is applied to find the best feature subset. An evaluation process is also introduced to test the performance of the five classifiers used in this paper (Decision Tree, KNN, Naive Bayesian, SVM, and ANN). The proposed framework was verified on three NIR spectral datasets collected from different parts of oranges (two parts of the fruit surface, and the juice) from 15 different places in China. The experimental results demonstrate that the juice NIR spectra are the most suitable dataset for identifying orange growing locations, and that the decision tree is the best and most stable classifier, achieving the highest average prediction rate of 96.66%.
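The abstract's pipeline (reduce the spectra with PCA, then score a classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes NumPy is available, uses synthetic spectra in place of the orange datasets, omits the attribute selection step, and implements only a simple KNN as a stand-in for the five classifiers. All function names and parameter values (`n_components`, `k`, `n_folds`) are hypothetical choices made for illustration.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal axes (one per row)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project samples onto the principal axes learned on the training set."""
    return (X - mean) @ components.T

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest training samples (Euclidean)."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

def cross_val_accuracy(X, y, n_components=5, k=3, n_folds=5, seed=0):
    """Mean accuracy over shuffled folds; PCA is refit on each training split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = []
    for i in range(n_folds):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        mean, comps = pca_fit(X[train], n_components)
        Z_train = pca_transform(X[train], mean, comps)
        Z_test = pca_transform(X[test], mean, comps)
        pred = knn_predict(Z_train, y[train], Z_test, k)
        scores.append((pred == y[test]).mean())
    return float(np.mean(scores))

def make_synthetic_spectra(n_classes=3, n_per_class=20, n_wavelengths=100, seed=0):
    """Synthetic stand-in data: one Gaussian 'absorption peak' per class plus noise."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0, 1, n_wavelengths)
    X, y = [], []
    for c in range(n_classes):
        center = 0.2 + 0.3 * c
        base = np.exp(-((grid - center) ** 2) / 0.01)
        X.append(base + 0.05 * rng.standard_normal((n_per_class, n_wavelengths)))
        y.append(np.full(n_per_class, c))
    return np.vstack(X), np.concatenate(y)

X, y = make_synthetic_spectra()
accuracy = cross_val_accuracy(X, y)
print(f"mean 5-fold accuracy: {accuracy:.3f}")
```

Refitting PCA inside each fold keeps the held-out samples from influencing the projection, so the reported accuracy is not inflated by information leaking from the test split. Comparing several classifiers, as the paper does, would amount to swapping `knn_predict` for other models inside the same loop.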