Study on the method and model of rice quality monitoring based on hyperspectral data
Shibo Yan, Xiuzhen Wang, Jingfeng Huang, Jia Liu, Limin Wang
DOI: https://doi.org/10.1109/Agro-Geoinformatics.2016.7577640
2016-01-01
Abstract: Crude protein and amylose content are two important indices of rice quality. The aim of this study is to find an appropriate method and model for monitoring rice quality with hyperspectral data. We collected samples in 2013 and 2014, when the rice reached maturity, in the east of Deqing County and acquired hyperspectral data for four different forms of rice: ear of rice, paddy, rice grain, and rice flour. We then measured the crude protein and amylose content of the rice flour following NY/T 3-1982 and NY/T 83-1988. We analyzed the 400–2400 nm range, smoothed the spectra with a 9-point weighted moving average, and divided the samples into two parts: 70 samples for model building and 36 samples for validation. The results showed that the paddy spectra (R) had the best correlation with crude protein and amylose content: reflectance at 612 nm correlated best with crude protein content (r = -0.5561) and reflectance at 1409 nm with amylose content (r = -0.482). Surface smoothness may have influenced the spectra of rice grain, and density may have influenced the spectra of rice flour, so we chose the original paddy spectra to derive further spectral variables. For spectral transformation, we used the derivative transform (R'), the logarithmic transform (Lg(1/R)), and continuum removal (Rr). Among R, R', Lg(1/R), and Rr, R'441 (r = -0.6417) had the best correlation with crude protein content and R'792 (r = -0.5549) had the best correlation with amylose content. We used these two spectral variables to build single-factor models (linear, exponential, parabolic) and used partial least squares regression (PLSR) to model the relationship between R' and the two indices (PLSR+R'). Among the single-factor models, the exponential model had the highest r² for crude protein content (r² = 0.4364, RMSE = 0.7756%) and the parabolic model had the highest r² for amylose content (r² = 0.3485, RMSE = 2.9952%). PLSR+R' fit better than both (r² = 0.5945, RMSE = 0.4192% for crude protein content; r² = 0.6062, RMSE = 1.8401% for amylose content) and was therefore more suitable for model building. When all of the above models were checked against the validation samples, r² between measured and estimated values was lower than for the modeling samples in every case. PLSR+R' again had the highest r² (0.291 for crude protein content and 0.3786 for amylose content), whereas the highest r² among the single-factor models was only 0.2333 for crude protein content (exponential model) and 0.1573 for amylose content (parabolic model). In conclusion, the most appropriate method is to acquire hyperspectral data of paddy and use PLSR+R' to monitor rice quality.
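
The preprocessing and band-screening steps named in the abstract (9-point weighted moving average, derivative transform, per-band correlation with a quality index, and RMSE) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the triangular smoothing weights, the central-difference derivative, and all function names are assumptions, since the abstract does not specify them.

```python
import math


def weighted_moving_average(values, weights=(1, 2, 3, 4, 5, 4, 3, 2, 1)):
    """Smooth one spectrum with a centered 9-point weighted moving average.

    The triangular weights are an assumption; the abstract does not list them.
    Points near the ends, where the window does not fit, are left unsmoothed.
    """
    half = len(weights) // 2
    total = sum(weights)
    smoothed = list(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        smoothed[i] = sum(w * v for w, v in zip(weights, window)) / total
    return smoothed


def first_derivative(wavelengths, values):
    """Approximate R' by central differences (one-sided at the two ends).

    Needs at least two points with distinct wavelengths.
    """
    n = len(values)
    deriv = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        deriv.append((values[hi] - values[lo]) / (wavelengths[hi] - wavelengths[lo]))
    return deriv


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_band(wavelengths, spectra, target):
    """Return (wavelength, r) for the band with the largest |r| against target.

    spectra is a list of per-sample lists, one value per wavelength.
    """
    best = None
    for j, wl in enumerate(wavelengths):
        column = [sample[j] for sample in spectra]
        r = pearson_r(column, target)
        if best is None or abs(r) > abs(best[1]):
            best = (wl, r)
    return best


def rmse(measured, estimated):
    """Root mean square error between measured and estimated values."""
    n = len(measured)
    return math.sqrt(sum((m - e) ** 2 for m, e in zip(measured, estimated)) / n)
```

Under these assumptions, each sample's spectrum would be passed through `weighted_moving_average` and then `first_derivative`, and `best_band` would be run on the resulting derivative spectra against the measured crude protein or amylose values to screen for a band such as R'441 or R'792.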