Wavelength selection and spectral discrimination for paddy rice, with laboratory measurements of hyperspectral leaf reflectance
Shalei Song,Wei Gong,Bo Zhu,Xin Huang
DOI: https://doi.org/10.1016/j.isprsjprs.2011.05.002
IF: 12.7
2011-01-01
ISPRS Journal of Photogrammetry and Remote Sensing
Highlights
► We select 552, 675, 705 and 776 nm to discriminate rice leaves under nitrogen stress.
► We select 1158, 1378 and 1965 nm to discriminate rice leaves under irrigation stress.
► These selected narrow bands contain the majority of the information in the rice leaf spectra.
► The selected wavelengths have great potential for use in the design of future sensors.

Abstract
The objective of this research is to select the most sensitive wavelengths for discriminating the imperceptible spectral variations of paddy rice under different cultivation conditions. The paddy rice was cultivated under four different nitrogen cultivation levels and three water irrigation levels. There are 2151 hyperspectral wavelengths available, both in hyperspectral reflectance and in energy space transformed spectral data. Based on these two data sets, principal component analysis (PCA) and band-band correlation methods were used to select significant wavelengths with no reference to leaf biochemical properties, while the partial least squares (PLS) method assessed the contribution of each narrow band to leaf biochemical content, associated with each loading weight across the nitrogen and water stresses. Moreover, several significant narrow bands and other broad bands were selected to establish eight kinds of wavelength (broad-band) combinations, with the focus on comparing the performance of the narrow-band combinations against the broad-band combinations for rice monitoring applications. Finally, to investigate the capability of the selected wavelengths to diagnose the stress conditions across the different cultivation levels, four selected narrow bands (552, 675, 705 and 776 nm) were calculated and compared between nitrogen-stressed and non-stressed rice leaves using linear discriminant analysis (LDA). Also, wavelengths of 1158, 1378 and 1965 nm were identified as the most useful bands to diagnose the stress condition across three irrigation levels.
Results indicated that good discrimination was achieved. Overall, the narrow bands based on hyperspectral reflectance data appear to have great potential for discriminating rice of differing cultivation conditions and for detecting stress in rice vegetation; these selected wavelengths also have great potential use for the designing of future sensors.

Keywords: Hyperspectral data; Wavelength selection; PCA; Band-band correlation; PLS; Rice

1 Introduction

Given the present and increasing requests to devise high-yielding cultivation techniques for good-quality super hybrid rice, there is an urgent need to develop rigorous plans and procedures for its growing environment impact inspection and assessment. This has motivated the widespread use of remote sensing technologies for monitoring the status of rice growth. The almost imperceptible variations in rice under different cultivation conditions will have varying effects on the hyperspectral reflectance; thus it is essential to develop techniques for remotely quantifying the structure, distribution and health of the rice crop. Previous studies over the past decades have successfully used hyperspectral data to quantify the canopy characteristics of crops; some researchers found that leaf spectral reflectance increases in portions of the visible and very-near infrared range as a plant experiences physiological stress (Carter, 1994; Carter and Knapp, 2001; Hansen and Schjoerring, 2003). However, few such studies have been conducted on the monitoring of rice leaf growing with environmental stress, especially nitrogen deficiency, which leads to reduced canopy density (vigour) and premature yellowing of the foliage (chlorosis) in autumn; as does poor irrigation, which would lead to leaf metabolic insufficiency.
The work presented in this paper aims to understand how the symptoms of environmental stress are manifested in the hyperspectral reflectance of each wavelength at the leaf level and, moreover, how to extract the wavelengths that can recognize and discriminate the growth status of paddy rice. Hyperspectral remote sensing technologies have allowed the development of an increasing number of spectral bands and, consequently, an improved capability for gaining a greater understanding of the fundamental processes that govern changes in the biophysical/biochemical properties of vegetation ( Renzullo et al., 2006 ). Researchers have often attempted to establish a causal link between measured spectral reflectance and the foliar biochemical composition and/or plant physiology ( Shibayama and Akiyama, 1989; Yoder and Pettigrew-Crosby, 1995; Curran et al., 1997, 2001; Blackburn, 1999; Sims and Gamon, 2002; Coops et al., 2003 ), and their ability to discriminate different species ( Kleynen et al., 2003; Huang and Zhang, 2009 ). How to identify the key spectral regions of interest, which could explain the variations observed in the biochemical difference, as represented in the spectral reflectance, is the fundamental research objective. Because of the high correlation inherent in adjacent wavelengths, a number of band selection methods have been developed and documented in remote sensing literature. However, no single best approach is yet available to determine the optimal number of wavelengths for the best estimate of rice characteristics. Most wavelength selection methods can be classified into two categories. The first is usually conducted using material chemistry with multiple linear regressions (e.g. stepwise regression) on object spectra. 
This approach provides the best linear spectral combinations to assess chemical concentration ( Gastellu-Etchegorry and Bruniquel-Pinel, 2001 ); past research has dictated the use of various ratio indices ( Aoki et al., 1981; Carter, 1994; Lyon et al., 1998 ), derivatives of reflectance spectra ( Elvidge and Chen, 1995; Thenkabail et al., 2000, 2002 ), and a linear mixture modelling approach ( Maas, 2000; Huang and Zhang, 2008 ). The other kinds of approach usually reduce the number of wavelengths by recursively applying a feature transformation such as principal component analysis (PCA) in a stepwise fashion and removing identified ‘noisy’ bands ( Csillag et al., 1993 ). Such approaches exploit the interdependence of bands to form groups from neighbouring bands, or define complex decision boundaries for the classification of high-dimensional data, such as neural networks ( Thenkabail et al., 2004 ). The spectral wavelength selection strategies all have benefits and drawbacks. Their success depends on the training sample size, the number of desirable components/regions and the type of spectral data to which they are applied. In order to have a comprehensive comparison of the band performance of rice hyperspectral data, we adapted representative waveband selection methods, both in parametric and nonparametric ways, and with different numbers of extracted components. As the magnitude of change in spectral reflectance in response to stress will vary at different wavelengths, it is still a question of whether and how hyperspectral data can be used to unambiguously detect physiological stress in rice. The selected narrow bands provide useful information in the interpretation, by remote sensing monitoring, of rice growing with environmental stresses. In addition, the technique can be a considered as a valuable tool for the selection of a sensor suitable for a particular problem, or even for the design of a new sensor. 
Our aim was to evaluate whether physiological stress in rice produces a distinct spectral signature in the leaves. The objectives of our study were: (i) to investigate the capability of hyperspectral data to distinguish the leaves of healthy (appropriately cultivated and irrigated) versus physiologically stressed rice; (ii) to develop an optimal narrow-band selection method and compare the performances of representative band selection methods for establishing different waveband combinations (including broad-band and narrow-band) to achieve the first objective; and (iii) to investigate the capability of the selected significant narrow bands to distinguish the leaves under different cultivation conditions.

2 Data description

2.1 Study areas and site description

The study area was an experimental paddy field at Junchuan town, Suizhou city, Hubei province, China. The area is known as the Jianghan plain and lies in the middle reaches of the Yangtze River; it is known as ‘the hometown of fish and rice’ in China, providing food security and acting as the most important agricultural production base in China. At present, the rice varieties cultivated in Hubei province are shifting from high-yielding varieties to varieties that are both super-quality and high-yielding, in line with the trend in China towards the industrialized production of super-quality rice. The paddy variety being studied is Luoyou 8, which is one of the three most advanced rice varieties in China (Fig. 1), and has been successfully promoted in some other important rice production countries such as Vietnam and Brazil. The rice was seeded on 15 May, and seedlings were transplanted on 15 June. It was cultivated under 4 × 3 combinations of cultivation conditions, giving a total of 12 treatments during the whole growing period: four nitrogen fertilization levels combined with three water irrigation levels.
The four nitrogen fertilization levels were: appropriate (180 kg/ha); insufficient (135 kg/ha); excessive (225 kg/ha) and no nitrogen. All four levels were fertilized in four stages: 50% of the total fertilizer as base fertilization, 20% at booting stage, 20% at tillering stage, and 10% at heading stage. Besides the fertilization controls, ridges of the paddy field were enclosed in plastic films to avoid water leakage, and the three treatments of water irrigation were: insufficient, appropriate and excessive. The details of the three irrigation levels are listed in Table 1. The plot size was 6 m × 20 m and each plot type was replicated three times with the same cultivation conditions. The same management practices were implemented for all rice plots (i.e. timing, pest and disease control, etc.).

2.2 Leaf collection and spectral measurements

The typical rice plant has a main stem about 1–1.8 m tall, with the leaves growing alternately on the two sides of the stem. Leaves at different positions on the stem may exhibit distinctive spectral characteristics. In order to minimize the confounding influence of location on spectral measurements, we stratified the leaf samples collected from each plant by height and then randomly selected 10 plants from each of the 12 treatments for sampling. For each plant, we chose three leaf samples, comprising one leaf from the upper part of the stem and two leaves from the lower part of the stem, and divided them into three corresponding sub-samples. Subsequent spectral measurements found that the reflectance patterns of leaves collected from upper versus lower heights on the stem did not differ significantly for any of the 12 treatments. Therefore, we used one sub-sample for spectral measurement, and another two sub-samples for biochemical (nitrogen and water) measurements.
The mean reflectance of the paddy field under the four nitrogen cultivation levels and three irrigation levels is plotted in Fig. 2. All leaves were collected in June and August 2008. They were immediately sealed in plastic bags, kept in an ice chest, and then transported to the laboratory for spectral measurements. Leaf reflectance was measured with a Field Spec Pro FR (Analytical Spectral Devices Inc., Boulder, USA). The measurement procedure followed that employed by Pu et al. (2003). The light source was a 100 W halogen reflectorized lamp. All spectra were measured at the nadir direction of the radiometer with a 25° FOV. A standard whiteboard was employed as the white reference and measured every five minutes to convert leaf radiance to spectral reflectance. Reflectance spectra of leaves, picked randomly from the upper hemisphere of the leaf, were collected by measuring spots of 10 mm diameter using a plant probe. Spectral measurement was not easy as the rice leaves were long and narrow; we cut each leaf into several pieces, laid the pieces on top of a calibrated black board, and took care to make sure the field of view was fully occupied. The adaxial surface of each sample was measured three times, from which an average spectral reflectance curve was generated. Spectral reflectance was originally measured over the ranges of 350–1000 nm at 1.4 nm intervals and 1000–2500 nm at 2.2 nm intervals. The entire spectral range (350–2500 nm) was automatically resampled to 1 nm resolution.

2.3 Biochemical and physiological data

To determine whether the reflectance patterns of leaves from the rice under nitrogen- and water-stressed conditions could be successfully discriminated, the nitrogen content, chlorophyll-a content and water content were measured for each selected leaf of the two sub-samples. The separate biochemical measurements were derived from destructive chemical analysis in the biochemical laboratory.
One sub-sample was used for chlorophyll concentration measurements with an acetone (80%) extraction method (Hernandez et al., 1995), and then the micro-Kjeldahl technique with salicylate was used for nitrogen concentration determination. Water content was measured by weighing the selected fresh and dried paddy leaves of the third sub-sample.

3 Methods

3.1 Data preprocessing

There were two cases of controlled cultivation, and we measured 10 spectra for each of the 12 treatments. For wavelength selection and classification of rice leaves under these treatments, the measured samples were randomly split into two parts: one part (4 × 3 × 5 samples) was used to analyse and establish the inversion model, and the remaining 4 × 3 × 5 samples were used for model precision evaluation and discrimination analysis. The hyperspectral curve accurately relates to physical aspects of absorption and reflectance behaviour (Piech and Piech, 1987). However, the measured hyperspectral curve of leaves contains instrument noise, which causes variations in the structural features of the curve; we therefore needed to transform the curve before the spectral analysis, in order to describe it more precisely and by its most important structural features. Hyperspectral analysts usually display spectra against wavelength in units of ‘μm’ or ‘nm’; however, plotting against the wavenumber, which equals the number of waves per unit length (most often expressed in units of cm$^{-1}$), eliminates the asymmetry caused by displaying on a constant-interval wavelength abscissa (Rossman, 1988). We can convert reflectance to ‘energy space’, which displays apparent absorbance, by applying the following transformation used in the physics community.
$$\text{frequency}\ (\mathrm{cm}^{-1}) = \frac{1000}{\text{wavelength}\ (\mu\mathrm{m})} \tag{1}$$

According to Brown’s method (Brown, 2006), the base 10 logarithm is the standard in the chemistry and planetary sciences communities, so we converted to apparent absorbance by taking the base 10 logarithm, and multiplied the spectrum by −1 in order to make the absorption features ‘positive’.

$$A(\lambda) = -\lg\big(R(\lambda)\big) \tag{2}$$

where $R(\lambda)$ is the reflectance at each wavelength $\lambda$. Then we converted the spectrum to energy space, where $A(\upsilon)$ is the spectrum expressed in frequency; the transformed spectra are shown in Fig. 3.

$$A(\upsilon) = \frac{1000}{A(\lambda)} \tag{3}$$

3.2 Wavelength selection

Hyperspectral data are highly correlated between adjacent wavelengths, so it was not necessary to include all 2151 measured wavebands in the application at one time. In this paper, in order to make a comprehensive comparison of all possible wavelength selections and, hence, to select the optimal wavelengths that best describe rice characteristics under controlled growing conditions, we used representative methods, including spatial inter-band correlation analysis, principal component analysis (PCA) and partial least squares (PLS) analysis. The most efficient wavelengths were selected by these methods separately, and then all the possible narrow bands were combined in several special ways for regression analysis and later for linear discriminant analysis (LDA). Firstly, inter-band correlation analysis was applied to separate wavelengths with rich information content from redundant wavelengths. The coefficients of determination ($R^2$) between all the hyperspectral wavelengths were computed in matrix form, and the matrices were plotted against wavelength. The $R^2$ models of wavelength $\lambda_i$ against wavelength $\lambda_j$ provided a rigorous search criterion in which every single wavelength $\lambda_i$ was correlated with every other wavelength $\lambda_j$, leading to $\lambda_i$–$\lambda_j$ plots (where $i, j = 1, \ldots, 2151$ index the wavelengths).
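The apparent-absorbance transform of Eq. (2) and the inter-band $R^2$ search described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' code: the function names (`apparent_absorbance`, `r_squared`, `least_redundant_pairs`) are hypothetical, spectra are assumed to be plain lists of reflectance values in (0, 1], and $R^2$ is taken here as the squared Pearson correlation between two bands across samples.

```python
import math


def apparent_absorbance(reflectance):
    """Eq. (2): A = -log10(R), applied to each reflectance value in (0, 1]."""
    return [-math.log10(r) for r in reflectance]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant band carries no correlation information
    return sxy * sxy / (sxx * syy)


def least_redundant_pairs(spectra, n_pairs):
    """Return the n_pairs (R^2, i, j) tuples with the lowest R^2.

    spectra is a list of samples, each a list of band values; i < j are
    band indices. A low R^2 means little redundancy between two bands.
    """
    n_bands = len(spectra[0])
    columns = [[sample[j] for sample in spectra] for j in range(n_bands)]
    scored = []
    for i in range(n_bands):
        for j in range(i + 1, n_bands):
            scored.append((r_squared(columns[i], columns[j]), i, j))
    scored.sort()
    return scored[:n_pairs]
```

For the full 2151-band data set the double loop visits roughly 2.3 million band pairs, so a vectorized correlation matrix would be the more practical choice; the loop form is kept here because it mirrors the pairwise criterion directly.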
In our hyperspectral data set, we obtained a total of 2,311,250 coefficients of determination ($R^2$) involving all possible wavelength combinations. The criterion of band selection is that the lower the $R^2$ value, the less the redundancy between two wavebands. According to this criterion, the wavebands corresponding to the 100 lowest $R^2$ values were selected from all rice leaf spectra collected from the different cases, and then these bands were analysed. The inter-band correlation analysis selects a rough spectral region with little redundancy; the PCA algorithm was then performed to compute the contribution of each wavelength to the principal components as an indicator for wavelength selection. In the PCA algorithm, the raw spectral reflectance data $X = (x_1, x_2, \cdots, x_p)$, with dimension $p$ and sample number $m$, can be decomposed as:

$$X_{m \times p} = T_{m \times k} P_{k \times p} + E_k \tag{4}$$

where $T$ contains the scores, $P$ the spectral loadings and $E_k$ is the residual after $k$ principal components. The algorithm extracts one component at a time. Each component is obtained iteratively, by repeated regression of $X$ on $T$ to obtain an improved $P$, and of $X$ on $P$ to obtain an improved $T$. The spectral loadings for each principal component can be viewed separately; alternatively, they reflect the correlation between the principal component $Y_j$ and the $i$th wavelength $X_i$. According to this correlation, we can calculate the total contribution $v_i$ of the $i$th wavelength to the first $k$ principal components. Therefore, $v_i$ is the evidence for wavelength selection by PCA.

$$v_i = \sum_{j=1}^{k} P^{2}(Y_j, X_i) \tag{5}$$

The two methods (inter-band correlation and PCA) were integrated in order to determine the wavelengths with the highest frequency of occurrence in the full spectral range. However, the wavelengths selected with these methods cannot be confirmed with direct measurements of leaf physiological status.
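The per-wavelength contribution of Eq. (5) can be illustrated with a short NumPy sketch. It is written under stated assumptions rather than as the authors' implementation: the components are obtained with a singular value decomposition instead of the iterative one-component-at-a-time scheme described above (both yield the same principal components), $P(Y_j, X_i)$ is read as the Pearson correlation between the scores of component $j$ and band $i$, every band is assumed to be non-constant, and the function name `pca_wavelength_contribution` is hypothetical.

```python
import numpy as np


def pca_wavelength_contribution(X, k):
    """Eq. (5): v_i = sum over the first k PCs of corr(Y_j, X_i)^2.

    X is an (m samples x p wavelengths) array; returns a length-p vector.
    """
    Xc = X - X.mean(axis=0)                      # centre each wavelength
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * s[:k]                    # PC scores Y_1 .. Y_k
    # Covariance between each PC score and each wavelength (k x p)
    cov = scores.T @ Xc / X.shape[0]
    # Turn covariances into correlations using population std devs
    corr = cov / np.outer(scores.std(axis=0), Xc.std(axis=0))
    return (corr ** 2).sum(axis=0)
```

With this reading, $v_i$ lies between 0 and 1 and reaches 1 for every wavelength once all components are retained, so ranking wavelengths by $v_i$ is only informative for a small $k$ such as the first few components.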
We adopted a PLS regression method for narrow-band selection and further for regression assessment. PLS regression is a recently developed generalization of multiple linear regression (MLR) (Höskuldsson, 1988). PLS regression is of particular interest because, unlike MLR, it can analyse data with strongly collinear (correlated), noisy, and numerous X-variables, and also simultaneously model several response variables. PLS regression aims to link the response variable $Y$ (in this paper referring to biochemical content) to the matrix of predictors $X$ (spectral reflectance data) through latent variables (or factors) (Hansen and Schjoerring, 2003). In addition, PLS regression models the ‘structure’ of both $X$ and $Y$, which gives richer results than the traditional multiple regression approach. PLS regression has the desirable property that the precision of the model parameters improves with the increasing number of relevant variables and observations. It reduces full-spectrum data to a smaller set of independent latent variables or principal components (PCs). As a result, full-spectrum wavelength loadings for significant PLS regression factors, from which regression coefficients for each wavelength are derived, describe the spectral variation most relevant to the modelling of variation in the data (Nelson et al., 1996). The decomposition of the predictors $X$ and the final model predicting $y$ have the following form:

$$X = t_1 p_1' + \cdots + t_a p_a' + E_a \tag{6}$$

$$y = t_1 q_1 + \cdots + t_a q_a + f_a \tag{7}$$

The equations are similar to the PCA algorithm; $t$ is a vector of scores calculated by $t_a = X_{a-1} w_a$ with scaled weights $w_a = c X_{a-1}' y_{a-1}$, where $c$ is the scaling factor, $p$ are the spectral loadings, $q$ the biochemical loadings, and $E$ and $f$ are the predictor and response variable residuals, respectively, of the estimated effect of the $a$th factor. Thus, the algorithm may be defined successively using the above equations and by incrementing $a = 1, 2, \cdots, A$.
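The successive extraction of scores, loadings and weights in Eqs. (6) and (7) can be sketched as a single-response PLS loop in NumPy. This is a minimal sketch under assumptions, not the authors' implementation: one response variable is modelled at a time, the scaling factor $c$ is chosen so that each weight vector has unit length, and the function name `pls1_loading_weights` is hypothetical.

```python
import numpy as np


def pls1_loading_weights(X, y, n_factors):
    """Return the loading weights w_1..w_A of a single-response PLS model.

    X is (samples x wavelengths), y is a length-samples response such as
    leaf nitrogen content. Each row of the result is one weight vector.
    """
    X = X - X.mean(axis=0)           # X_0: centred predictors
    y = y - y.mean()                 # y_0: centred response
    weights = []
    for _ in range(n_factors):
        w = X.T @ y                  # w_a proportional to X'_{a-1} y_{a-1}
        w = w / np.linalg.norm(w)    # scaling factor c gives unit length
        t = X @ w                    # scores t_a = X_{a-1} w_a
        tt = t @ t
        p = X.T @ t / tt             # spectral loadings p_a
        q = (y @ t) / tt             # response loading q_a
        X = X - np.outer(t, p)       # deflate to obtain the residual E_a
        y = y - q * t                # deflate to obtain the residual f_a
        weights.append(w)
    return np.array(weights)
```

Wavelengths whose entries in a weight vector have the largest absolute values are the ones most strongly tied to the response for that factor, which is the property the wavelength selection relies on.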
The number of factors to use in the PLS regression model may be determined through leave-one-out cross-validation (Rao and Wu, 2005). The first weight vector $w_1$ is the first eigenvector of the combined variance–covariance matrix $X'YY'X$; similarly, the first score vector $t_1$ is an eigenvector of $XX'YY'$. The weights give information about how the variables combine to form the quantitative relation between $X$ and $Y$, thus providing an interpretation of the scores $t_a$. Hence, these weights are essential for the selection of the important $X$-variables, which are those with large $w_a$ values. The PLS regression weights $w_a$ thus express the ‘positive’ correlations between $X$ and $Y$. That is to say, everything varying in $X$ is primarily related to $Y$, and $w_a$ is informative because its interpretation supplies direct evidence for the PLS regression wavelength selection.

3.3 Band combination and regression assessment

Using principal component analysis (PCA), band-band correlation analysis and partial least squares (PLS) regression analysis together, we could select the wavelengths which have the least spectral redundancy and are highly correlated with rice growing status. However, the effect may not be significant when we combine all the selected wavelengths for rice remote sensing monitoring. That is because some wavelengths might make a similar contribution to rice characteristics when combined together; for example, the wavelengths sensitive to nitrogen content and to chlorophyll content, respectively, would have a similar regression effect. To compare the performance of these selected wavelengths, we established several wavelength combinations for regression analysis and to test their discriminative capability. The regression analysis was based on PLS regression and PCA.
The PLS regression algorithm was described above, while for PCA regression analysis we have:

$$C = A_0 + \sum_{i=1}^{k} A_i W_i = (1, W_1, \cdots, W_k)(A_0, A_1, \cdots, A_k)^{T} = XA \tag{8}$$

where $W_i$ is the $i$th component, $C$ is a measured variable of leaf biochemical content, and the coefficient matrix $A$ is calculated by least squares. For the PLS regression and PCA models, the validation was performed on two data sets (raw data and energy space transformed data) by comparing differences in $R^2$, root mean square error (RMSE) and relative percentage deviation (RPD) (Williams and Norris, 2004). RMSE values were calculated according to Eq. (9). The RPD is the ratio of the standard deviation of the $y$ data to the RMSE of cross-validation predictions.

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}} \tag{9}$$

$$\mathrm{RPD} = \frac{100}{n} \sum_{i=1}^{n} \frac{\hat{y}_i - y_i}{y_i} \tag{10}$$

where $\hat{y}_i$ is the predicted value, $y_i$ is the measured value of the rice variable, and $n$ is the sample size. The goodness of fit is given by the $R^2$ and $Q^2$ (the cross-validated $R^2$) statistics, which bound how well the model explains the data and predicts new observations.

3.4 Discrimination between paddy leaves from different cultivation conditions

In the wavelength selection process, we selected the narrow bands that are particularly sensitive indicators of the stresses caused by nitrogen content and water content. The final aim of this paper is to demonstrate the feasibility of narrow-band combinations as an exploratory measure for the remote sensing supervision of high-yielding cultivation techniques for super hybrid rice. The classification of the two cases was based on an LDA procedure, with which several other researchers have achieved good discrimination results (Gong et al., 1997; Van Aardt and Wynne, 2001; Clark et al., 2005). The data used in the demonstration are both leaf-level reflectance spectra and energy space transformed spectra of rice leaves.
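The validation statistics of Eqs. (9) and (10) can be sketched in plain Python as follows. This is an illustration under assumptions: the function names are hypothetical, measured values are assumed to be non-zero, and the relative deviation in Eq. (10) is computed with an absolute value so that over- and under-predictions do not cancel. That absolute value is an assumption of this sketch and is not visible in the equation as printed.

```python
import math


def rmse(predicted, measured):
    """Eq. (9): root mean square error between predictions and measurements."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)


def relative_percentage_deviation(predicted, measured):
    """Eq. (10): mean relative deviation in percent.

    Assumption: |y_hat - y| is used in the numerator; measured values
    must be non-zero.
    """
    n = len(measured)
    return 100.0 / n * sum(abs(p - m) / m for p, m in zip(predicted, measured))
```

For example, predictions of 2.0 and 4.0 against measurements of 1.0 and 5.0 give an RMSE of 1.0 and a relative percentage deviation of 60%.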
The emphasis is on the ability of methods to separate the groups of rice growing cases with narrow-band combinations under different nitrogen cultivation levels and water irrigation levels.

4 Results and discussions

$\lambda_i$–$\lambda_j$ plots show the very high correlation ($r^2$) between any two wavebands, indicating rich or redundant information under two cultivation cases (Fig. 4a and b). For the nitrogen-controlled cultivation case, the most frequently occurring wavebands included the green, red and NIR from 500 to 850 nm; while for the water-controlled irrigation case, the least redundant spectral region was concentrated in the short-wavelength infrared, which were from 1100 to 2100 nm. The waveband widths and central wavelengths were optimized to provide maximum information and are determined from the $\lambda_i$–$\lambda_j$ plots. The shade of grey indicates the redundancy between wavelengths. Based on Fig. 4a, it can be concluded that the visible and ‘red edge’ of the spectrum contained the most information during leaf nitrogen absorption, due to the development of leaf pigments. More pigments imply a larger absorption of the electromagnetic energy in the visible part of the spectrum for photosynthetic use (Horler et al., 1983; Filella and Penuelas, 1994; Kumar et al., 2001), resulting in a decrease in reflectance. On the contrary, wavelengths positioned in the long region of the near-infrared spectrum apparently have most impact for leaf water absorption. The leaf optical properties in this region are driven by the mesophyll structure, dry matter and water content of the leaves (Jacquemoud et al., 1996). Therefore, it is believed that spectral changes due to different water irrigation levels are partially compensated by spectral changes due to structural alterations. Based on the inter-band correlation analysis, a rough spectral region was selected, but we still do not have the exact contribution of each wavelength to the spectral characteristic.
We carried out the PCA analysis to compute the principal components by using factor loadings (or eigenvectors) of each wavelength and multiplied the factor loadings by their respective wavelength reflectivity. This showed wavelengths with the highest factor loadings (eigenvectors) and the percentage of variability explained by each principal component. Therefore, the whole range of wavelengths can be reduced to the first few PCs (e.g. PC1–PC5). Two of the most frequently occurring wavelengths in each PC were presented under nitrogen stress and water stress separately ( Table 2 ). The first five PCs, which explained nearly 95% of the variability of the rice full-range spectral energy space, provided the highest factor loadings and were listed from PC1 to PC5. The listed wavelengths indicate the magnitude or ranking for that wavelength based on its factor loadings. The results of PCA analysis of rice spectra under nitrogen stress showed that some wavelengths, such as 585 and 675 nm, were heavily involved in the first two principal components, and had the highest factor loadings in the entire spectral range; that means the low red dominated the PC1 with a 71% frequency of occurrence, and the bands close to the ‘red edge’ in PC2 explained 14% of the variability. Also, the wavelengths at 1175 and 1315 nm provided the best results in the PCA analysis of rice samples with three water irrigations, while the far short-wavelength infrared (FSWIR) bands dominated PC1–PC2, explaining 78% of the variability. These results indicate the importance of low red and NIR wavelengths for nitrogen cultivation, and the FSWIR wavelengths for water irrigation of rice. With the inter-band correlation analysis and PCA analysis, we selected the significant wavelengths and waveband regions that contain rich information on the rice leaf spectra. 
However, we are still not sure about the direct relation between spectral wavelength and leaf chemical content (such as nitrogen content, chlorophyll content and water content). The results of the PLS weight analysis for each wavelength showed the contribution of each wavelength to the leaf chemical content. We used both the raw full spectra data and the spectra energy space data to establish this relationship. The different components can be defined by their respective scores and loadings. The scores are related to the single samples, while the loadings quantify the contribution of each wavelength to the model. The PLS loading weights (LWs) are loadings-orientated, allowing the optimal fit for the specific rice variable of interest. The LWs related to the three investigated variables (nitrogen content, chlorophyll content and water content) are shown in Fig. 5 a, reflecting the relationship between the performance of spectral wavelengths and the key canopy biochemical content. Based on this figure, it can be concluded that the biochemical contents (which in this paper are nitrogen, water and chlorophyll-a) have impacts on the entire range of the spectrum. However, the nitrogen and chlorophyll-a have very similar PLS factor weighting distributions, with impacts concentrated in the green, red and NIR regions. Water impacts mostly on the FSWIR wavelengths; the visible part of the spectrum tends to change the most during leaf expansion, due to the development of leaf pigments. The exact results are listed in Table 3 , which shows that the central wavelengths of 552, 675 and 775 nm have the greatest PLS loading weights of nitrogen content; the central wavelengths of 556, 660 and 776 nm have the greatest PLS loading weights of chlorophyll-a content; and the wavelengths of 1158, 1378 and 1965 nm have the greatest PLS loading weights of water content. The selected significant wavelengths confirmed our impressions from an initial visual inspection of the spectral curves. 
From Fig. 5b we can see that nearly all these selected central wavelengths are close to the significant absorption valleys and reflectance peaks. According to the raw spectral curve of healthy rice leaves, we find that the wavelengths close to 550 and 750 nm correspond to the leaf reflectance characteristics, and wavelengths in the red region, close to 670 nm, correspond to the leaf spectral radiation absorption characteristics. These characteristics coincide quite well with our results from the PLS weight analysis of nitrogen and chlorophyll content. In the same way, the absorption valleys of the raw spectral curve located in the vicinity of 1200, 1400 and 1950 nm are also close to our selected wavelengths at 1158, 1378 and 1965 nm (Fig. 5b). The PLS regression method, by selecting wavelengths with the largest loading weight, has great capability to reflect the characteristics of leaf biochemical content.

4.1 Wavelength combinations

Considering the redundancy of hyperspectral data and the represented spectrum range selected by the wavelengths or bands, we set a series of wavelength (band) combinations, in order to determine the nitrogen inversion effects of the selected wavelengths. Since the leaf water content affects mainly the long band of the NIR region, expressed as spectral absorption characteristics, we just selected a 3-wavelength combination, according to the result of the PLS loading weight analysis. The eight wavelength combinations are separated into two groups. The first three are broad-band combinations based on the results of PCA analysis and band-band correlation analysis, and the last five combinations are narrow-band combinations based on the results of PLS loading weight analysis.
The combination details are as follows:
(1) A 215-band combination, obtained by dividing the spectrum from 350 to 2500 nm into 215 bands and averaging over each 10 nm interval; it is thus similar to the band set of the first space-borne hyperspectral sensor, Hyperion (Pearlman et al., 2003).
(2) A 46-band combination, obtained by dividing the spectrum from 400 to 850 nm into 46 bands and averaging over each 10 nm interval. These bands lie mainly in the visible and near-infrared regions, which contain abundant information on vegetation.
(3) A 17-band combination. These bands are selected by the inter-band correlation analysis. The central wavelengths are 405, 565, 585, 605, 620, 640, 660, 680, 695, 705, 740, 780, 865, 910, 1085, 1530 and 1960 nm.
(4) A 10-wavelength combination: 410, 422, 556, 660, 675, 694, 705, 755, 758 and 776 nm. These wavelengths are chosen according to the PLS loading weight analysis for leaf nitrogen content regression.
(5) A 4-wavelength combination: 552, 675, 705 and 776 nm. This combination is chosen from the 10-wavelength combination but is more effective and representative.
(6) A 3-wavelength combination: 552, 675 and 776 nm. These three wavelengths have the largest PLS loading weights in the green, red and near-infrared regions, respectively.
(7) A 2-wavelength combination: 675 and 776 nm. These two wavelengths, adjacent to the ‘red edge’, contain important information on vegetation growth and are also useful for constructing a narrow-band normalized difference vegetation index (NDVI).
(8) A 3-wavelength combination for leaf water content regression: 1158, 1378 and 1965 nm.

The eight waveband or wavelength combinations listed are highly relevant to a range of vegetation and crop characteristics, as indicated by findings in the literature discussed below. According to existing research, the selected 4-wavelength combination (552, 675, 705 and 776 nm) is of particular relevance.
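Two of the constructions in the list above, the 10 nm band averaging of combination (1) and the narrow-band NDVI mentioned for combination (7), can be sketched as follows. The sketch assumes 1 nm sampling from 350 to 2500 nm (consistent with the 2151 wavelengths reported), consecutive non-overlapping windows with any trailing partial window dropped, and the common NDVI form (NIR − red)/(NIR + red). The exact windowing used in the study is not specified, and the function names and synthetic spectrum are hypothetical.

```python
def average_bands(reflectance, width=10):
    """Average consecutive, non-overlapping windows of `width` samples.

    A trailing window with fewer than `width` samples is dropped.
    """
    n_bands = len(reflectance) // width
    return [
        sum(reflectance[i * width:(i + 1) * width]) / width
        for i in range(n_bands)
    ]


def narrow_band_ndvi(wavelengths, reflectance, red_nm=675, nir_nm=776):
    """Narrow-band NDVI from two single wavelengths: (NIR - red) / (NIR + red)."""
    index = {wl: i for i, wl in enumerate(wavelengths)}
    red = reflectance[index[red_nm]]
    nir = reflectance[index[nir_nm]]
    return (nir - red) / (nir + red)


# Synthetic step-like spectrum (not measured data): low reflectance
# below 700 nm, high reflectance above, sampled every 1 nm.
wavelengths = list(range(350, 2501))          # 2151 wavelengths
reflectance = [0.05 if wl < 700 else 0.45 for wl in wavelengths]

broad_bands = average_bands(reflectance)      # 10 nm averages
ndvi = narrow_band_ndvi(wavelengths, reflectance)
```

With 2151 samples and a window of 10, the averaging yields 215 bands, matching the size of combination (1); the single leftover sample at 2500 nm is discarded under the stated assumption.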
Shibayama and Munakata (1986) established vegetation indices (VIs) employing the wavebands at 950/650 and 1100/1200 nm, respectively, to relate to the dry biomass of paddy rice canopies. Elvidge and Chen (1995) detected plant stress at red-edge bands centred at 705 and 735 nm. Blackburn (1999) and Thenkabail et al. (2000) found that wavelengths around 675 and 680 nm were most strongly correlated with the chlorophyll content of crops or vegetation. Schepers et al. (1996) reported strong relationships with total chlorophyll and nitrogen content at 555 nm. A further study by Thenkabail et al. (2004) recommended the 22 best narrow bands (10 nm width) in the 350–2500 nm range for discriminating natural vegetation and crop species. Other research (Curran et al., 2001) showed that the four wavebands centred at 1182, 1216, 1936 and 1920 nm were of particular importance for plant water absorption. Each of these wavelength selection studies had its own purpose and significance, while our study aimed to select the wavelengths that could recognize and discriminate paddy rice under different growing stresses. It is therefore necessary to evaluate the performance of the selected waveband (wavelength) combinations.

4.2 Waveband combination regression

Each wavelength (band) combination used 120 (3 × 4 × 10) training samples to establish an inversion model, and the remaining 60 (3 × 4 × 5) samples were used as observed data. The correlations between observed and predicted values for each combination are shown in Fig. 6. The performance of each combination was compared with that of a multivariate calibration based on PLS regression. We used PCA as a comparison method for PLS; PLS gives clearly better regression results. With the energy space transformation of the spectra, A(λ), PLS regression increased all R² values by around 0.02–0.16 compared with the raw spectral data (C).
This spectral transformation gives the most significant improvement when the number of variables is very small, as in the 3-wavelength regression for water, where the R² of the A(λ) model increased to 0.715. In terms of the RMSE, which represents the average error, the A(λ) PLS model also improved on the models built from the raw spectral data (Table 4). The validation was performed on the two data sets by comparing differences in R², RMSE and RPD, to estimate the predictive ability of the models. The correlation between the predicted and measured values for the validation samples of rice leaves is high for all eight band (wavelength) combinations. The 215-band combination has the largest coefficient of determination (0.89), and the 2-wavelength combination has the smallest (0.68). The narrow-band combinations, which combine 2–10 wavelengths, have coefficients of determination from 0.68 to 0.83, indicating good prediction accuracy of the regression models. The 4-wavelength combination (552, 675, 705 and 776 nm) adds only the 705 nm wavelength to the 3-wavelength combination, yet it clearly increases the coefficient of determination from 0.75 to 0.82, while the 10 narrow-band combination and the 17 broad-band combination provide little further improvement, with an increase of just 0.01 to around 0.83. However, because the leaf samples lost water during our measurement experiment, the 3-wavelength (1158, 1378 and 1965 nm) regression for the water content of rice leaves has poor accuracy compared with the narrow-band combinations for leaf nitrogen content regression. These results indicate that a limited narrow-band (wavelength) combination can overcome the redundancy of hyperspectral data while still providing sufficient information on rice, and has great potential for remote sensing applications.
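The three validation statistics named above can be computed as in the following sketch. The definitions are assumptions, since they are not spelled out in this section: R² is taken as the squared Pearson correlation between observed and predicted values, RMSE as the root of the mean squared difference, and RPD as the sample standard deviation of the observed values divided by the RMSE. The function name and the numbers are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, rmse, rpd) for paired observed and predicted values.

    Assumed definitions: r2 = squared Pearson correlation,
    rmse = sqrt(mean squared error), rpd = sd(observed) / rmse.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r2 = cov * cov / (ss_obs * ss_pred)
    rmse = math.sqrt(sum((o - p) ** 2
                         for o, p in zip(observed, predicted)) / n)
    sd_obs = math.sqrt(ss_obs / (n - 1))   # sample standard deviation
    rpd = sd_obs / rmse
    return r2, rmse, rpd


# Hypothetical validation values: predictions with a constant offset of 0.5.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 3.5, 4.5, 5.5]
r2, rmse, rpd = regression_metrics(observed, predicted)
```

The example is chosen to show why R² alone is not sufficient: a constant bias leaves the correlation-based R² at 1 while the RMSE is 0.5, which is one reason to report RMSE and RPD alongside it.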
4.3 Discrimination analysis

To test the classification ability of the narrow-band combinations without a priori knowledge of the rice growing state, we used the 4-wavelength combination (552, 675, 705 and 776 nm) and the 3-wavelength combination (1158, 1378 and 1965 nm) with LDA for the final classification of rice samples in two cases: four nitrogen cultivation levels and three water irrigation levels. Fig. 7 presents the distribution of the test samples in the discriminant space. The two discriminant functions of LDA are effective in distinguishing paddy leaves in the two cultivation cases. Together with the LDA classification, the results indicate that our methods for extracting influential narrow bands from the hyperspectral data succeeded in discriminating rice leaves with different growing status. For the narrow-band combinations, LDA proved an effective procedure for building the best discriminant function.

5 Conclusions

These investigations show that the narrow-band combinations characterize the rice status well and have great potential for inspecting and assessing the impact of the growing environment on rice. Using several parametric and nonparametric methods, a comprehensive comparison was made to select the most influential narrow-band combination (552, 675, 705 and 776 nm) for discriminating rice leaves from four nitrogen cultivation conditions; a 3-wavelength combination (1158, 1378 and 1965 nm) was also established to enhance the spectral discrimination of rice leaves grown under three irrigation conditions. These selected narrow bands contained the majority of the rice information, in comparison with the performance of other representative band combinations. The narrow-band combinations were then applied in a further experiment to LDA-based classification.
The well-separated discriminant spaces directly demonstrate the feasibility of using these selected narrow bands instead of the full range of wavelengths. A reduction in the number of bands, without significant information loss, is important because it makes it possible to achieve fine spatial resolution without sacrificing the ability to characterize the rice status. Most hyperspectral studies (Thenkabail et al., 2000, 2002, 2004; Okin et al., 2001; Hansen and Schjoerring, 2003) concluded that fewer than 30 wavebands are needed to obtain the best crop and vegetation information. Compared with these studies, our results indicate that the selected four-narrow-band combination is highly effective in explaining the characteristics of the data. We believe that a small number of narrow bands, which gives access to the influences of environmental and cultivation conditions, is most effective for monitoring and detecting the rice growing status. Although this study successfully identified narrow-band combinations for discriminating rice under different cultivation conditions, the linkage between leaf-level and canopy-level spectral features needs to be strengthened, following Carter and Estep (2002) and Muttiah (2002); more observations are needed to detect other stresses, and more appropriate wavelength selection methods need to be adopted or developed. Finally, the method needs to be applied to more rice varieties, under a range of different stresses.

Acknowledgements

This work was supported by the Major State Basic Research Development Program 973 Project (2009CB723905), 863 Project (2009AA12Z107), 973 Project (2006CB403701), NSFC (10978003), NSFC (40871171), and the Program for New Century Excellent Talents in University (NCET-07-0629).

References

Aoki et al., 1981 M. Aoki K. Yabuki T.
Totsuka An evaluation of chlorophyll content of leaves based on the spectral reflectivity in several plants Research Report of the National Institute of Environmental Studies of Japan 66 1981 125 130
Blackburn, 1999 G.A. Blackburn Relationships between spectral reflectance and pigment concentrations in stacks of deciduous broadleaves Remote Sensing of Environment 70 2 1999 224 237
Brown, 2006 A.J. Brown Spectral curve fitting for automatic hyperspectral data analysis IEEE Transactions on Geoscience and Remote Sensing 44 6 2006 1601 1608
Carter, 1994 G.A. Carter Ratios of leaf reflectances in narrow wavebands as indicators of plant stress International Journal of Remote Sensing 15 3 1994 697 703
Carter and Knapp, 2001 G.A. Carter A.K. Knapp Leaf optical properties in higher plants: linking spectral characteristics to stress and chlorophyll concentration American Journal of Botany 88 4 2001 677
Carter and Estep, 2002 Carter, G.A., Estep, L., 2002. General spectral characteristics of leaf reflectance responses to plant stress and their manifestation at the landscape scale. In: From Laboratory Spectroscopy to Remotely Sensed Spectra of Terrestrial Ecosystems, pp. 271–293.
Clark et al., 2005 M.L. Clark D.A. Roberts D.B. Clark Hyperspectral discrimination of tropical rain forest tree species at leaf to crown scales Remote Sensing of Environment 96 3–4 2005 375 398
Coops et al., 2003 N.C. Coops M.L. Smith M.E. Martin S.V. Ollinger Prediction of eucalypt foliage nitrogen content from satellite-derived hyperspectral data IEEE Transactions on Geoscience and Remote Sensing 41 6 2003 1338 1346
Csillag et al., 1993 F. Csillag L. Pásztor L.L. Biehl Spectral band selection for the characterization of salinity status of soils Remote Sensing of Environment 43 3 1993 231 242
Curran et al., 1997 P.J. Curran J.A. Kupiec G.M. Smith Remote sensing the biochemical composition of a slash pine canopy IEEE Transactions on Geoscience and Remote Sensing 35 2 1997 415 420
Curran et al., 2001 P.J. Curran J.L. Dungan D.L. Peterson Estimating the foliar biochemical concentration of leaves with reflectance spectrometry: testing the Kokaly and Clark methodologies Remote Sensing of Environment 76 3 2001 349 359
Elvidge and Chen, 1995 C.D. Elvidge Z. Chen Comparison of broad-band and narrow-band red and near-infrared vegetation indices Remote Sensing of Environment 54 1 1995 38 48
Filella and Penuelas, 1994 I. Filella J. Penuelas The red edge position and shape as indicators of plant chlorophyll content, biomass and hydric status International Journal of Remote Sensing 15 7 1994 1459 1470
Gastellu-Etchegorry and Bruniquel-Pinel, 2001 J.P. Gastellu-Etchegorry V. Bruniquel-Pinel A modeling approach to assess the robustness of spectrometric predictive equations for canopy chemistry Remote Sensing of Environment 76 1 2001 1 15
Gong et al., 1997 P. Gong R. Pu B. Yu Conifer species recognition: an exploratory analysis of in situ hyperspectral data Remote Sensing of Environment 62 2 1997 189 200
Hansen and Schjoerring, 2003 P.M. Hansen J.K. Schjoerring Reflectance measurement of canopy biomass and nitrogen status in wheat crops using normalized difference vegetation indices and partial least squares regression Remote Sensing of Environment 86 4 2003 542 553
Hernandez et al., 1995 J.A. Hernandez E. Olmos F.J. Corpas F. Sevilla L.A. Del Rio Salt-induced oxidative stress in chloroplasts of pea plants Plant Science 105 2 1995 151 167
Horler et al., 1983 D.N.H. Horler M. Dockray J. Barber The red edge of plant leaf reflectance International Journal of Remote Sensing 4 2 1983 273 288
Höskuldsson, 1988 A. Höskuldsson PLS regression methods Journal of Chemometrics 2 3 1988 211 228
Huang and Zhang, 2008 X. Huang L. Zhang An adaptive mean-shift analysis approach for object extraction and classification from urban hyperspectral imagery IEEE Transactions on Geoscience and Remote Sensing 46 12 2008 4173 4185
Huang and Zhang, 2009 X. Huang L. Zhang Evaluation of morphological texture features for mangrove forest mapping and species discrimination using multispectral IKONOS imagery IEEE Transactions on Geoscience and Remote Sensing 6 3 2009 393 397
Jacquemoud et al., 1996 S. Jacquemoud S.L. Ustin J. Verdebout G. Schmuck G. Andreoli B. Hosgood Estimating leaf biochemistry using the PROSPECT leaf optical properties model Remote Sensing of Environment 56 3 1996 194 202
Kleynen et al., 2003 O. Kleynen V. Leemans M.F. Destain Selection of the most efficient wavelength bands for ‘Jonagold’ apple sorting Postharvest Biology and Technology 30 3 2003 221 232
Kumar et al., 2001 L. Kumar K.S. Schmidt S. Dury A.K. Skidmore Imaging spectrometry and vegetation science F.D. van der Meer S.M. de Jong Imaging Spectrometry: Basic Principles and Prospective Applications. Remote Sensing and Digital Image Processing vol. 4 2001 Kluwer Academic Press Dordrecht, Netherlands 111 155
Lyon et al., 1998 J.G. Lyon D. Yuan R.S. Lunetta C.D. Elvidge A change detection experiment using vegetation indices Photogrammetric Engineering & Remote Sensing 64 2 1998 143 150
Maas, 2000 S.J. Maas Linear mixture modeling approach for estimating cotton canopy ground cover using satellite multispectral imagery Remote Sensing of Environment 72 3 2000 304 308
Muttiah, 2002 R.S. Muttiah From Laboratory Spectroscopy to Remotely Sensed Spectra of Terrestrial Ecosystems 2002 Kluwer Academic Publishers Dordrecht, Netherlands
Nelson et al., 1996 P.R.C. Nelson P.A. Taylor J.F. MacGregor Missing data methods in PCA and PLS: score calculations with incomplete observations Chemometrics and Intelligent Laboratory Systems 35 1 1996 45 65
Okin et al., 2001 G.S. Okin D.A. Roberts B. Murray W.J. Okin Practical limits on hyperspectral vegetation discrimination in arid and semiarid environments Remote Sensing of Environment 77 2 2001 212 225
Pearlman et al., 2003 J.S. Pearlman P.S. Barry C.C. Segal J. Shepanski D. Beiso S.L. Carman Hyperion, a space-based imaging spectrometer IEEE Transactions on Geoscience and Remote Sensing 41 6 2003 1160 1173
Piech and Piech, 1987 M. Piech K.R. Piech Symbolic representation of hyperspectral data Applied Optics 26 18 1987 4018 4026
Pu et al., 2003 R. Pu S. Ge N.M. Kelly P. Gong Spectral absorption features as indicators of water status in coast live oak (Quercus agrifolia) leaves International Journal of Remote Sensing 24 9 2003 1799 1810
Rao and Wu, 2005 C.R. Rao Y. Wu Linear model selection by cross-validation Journal of Statistical Planning and Inference 128 1 2005 231 240
Renzullo et al., 2006 L.J. Renzullo A.L. Blanchfield K.S. Powell A method of wavelength selection and spectral discrimination of hyperspectral reflectance spectrometry IEEE Transactions on Geoscience and Remote Sensing 44 7 2006 1986 1994
Rossman, 1988 G.R. Rossman Vibrational spectroscopy of hydrous components F.C. Hawthorne Mineralogy. Spectroscopic Methods in Mineralogy and Geology 1988 Mineralogical Society of America Reviews 193 206
Sims and Gamon, 2002 D.A. Sims J.A. Gamon Relationships between leaf pigment content and spectral reflectance across a wide range of species, leaf structures and developmental stages Remote Sensing of Environment 81 2–3 2002 337 354
Schepers et al., 1996 J.S. Schepers T.M. Blackmer W.W. Wilhelm M. Resende Transmittance and reflectance measurements of corn leaves from plants with different nitrogen and water supply Journal of Plant Physiology 148 5 1996 523 529
Shibayama and Akiyama, 1989 M. Shibayama T. Akiyama Seasonal visible, near-infrared and mid-infrared spectra of rice canopies in relation to LAI and above-ground dry phytomass Remote Sensing of Environment 27 2 1989 119 127
Shibayama and Munakata, 1986 M. Shibayama K. Munakata A spectroradiometer for field use III: a comparison of some vegetation indices for predicting luxuriant paddy rice biomass Japanese Journal of Crop Science 55 1986 47 52
Thenkabail et al., 2000 P.S. Thenkabail R.B. Smith E. De Pauw Hyperspectral vegetation indices and their relationships with agricultural crop characteristics Remote Sensing of Environment 71 2 2000 158 182
Thenkabail et al., 2002 P.S. Thenkabail R.B. Smith E. De Pauw Evaluation of narrowband and broadband vegetation indices for determining optimal hyperspectral wavebands for agricultural crop characterization Photogrammetric Engineering & Remote Sensing 68 6 2002 607 622
Thenkabail et al., 2004 P.S. Thenkabail E.A. Enclona M.S. Ashton B. Van Der Meer Accuracy assessments of hyperspectral waveband performance for vegetation analysis applications Remote Sensing of Environment 91 3–4 2004 354 376
Van Aardt and Wynne, 2001 J.A.N. Van Aardt R.H. Wynne Spectral separability among six southern tree species Photogrammetric Engineering & Remote Sensing 67 12 2001 1367 1376
Williams and Norris, 2004 P. Williams K. Norris Near-Infrared Technology in the Agricultural and Food Industries 2004 American Association of Cereal Chemists St. Paul, MN
Yoder and Pettigrew-Crosby, 1995 B.J. Yoder R.E. Pettigrew-Crosby Predicting nitrogen and chlorophyll content and concentrations from reflectance spectra (400–2500 nm) at leaf and canopy scales Remote Sensing of Environment 53 3 1995 199 211