Relationship between Training Sample Size and Rice Mapping Accuracy Using Sentinels 1 and 2

Barideh, Rahman; Nasimi, Fereshteh
DOI: https://doi.org/10.1007/s12524-024-02020-y
IF: 1.894
2024-10-06
Journal of the Indian Society of Remote Sensing
Abstract: In the last few decades, advances in science and the growing variety of optical and radar satellites have made it possible to monitor and classify crops on a large scale. However, one of the main challenges in satellite imagery classification is the number of training samples required. This study therefore investigates the relationship between training sample size and the classification accuracy of rice and non-rice covers. For this purpose, Sentinel-1 and Sentinel-2 images, the Random Forest (RF) classifier, and Google Earth Engine were used. In total, 2,500 rice samples and 9,500 non-rice samples were prepared in the study area, and 100 different runs were performed with varying training sample sizes. The results showed that the backscatter time series of Sentinel-1 images make it possible to distinguish rice from non-rice covers with high accuracy. Accordingly, the inputs to the Random Forest classifier included the Backscatter Slope (BS), the Backscatter Difference (BD) between the maximum and minimum values of the time series, and the Normalized Difference Vegetation Index (NDVI). According to the results, the relationship between training sample size and classification accuracy is non-linear, i.e., beyond a certain point the accuracy decreases as the number of samples increases. The highest overall accuracy and kappa coefficient were obtained using 50% of the training samples. With this sample size, the rice cultivation map was produced, and the rice and non-rice areas were estimated at 158,384 ha and 2,225,816 ha, respectively. The highest overall accuracy (89%) and kappa coefficient (0.86) were obtained when one training sample per 181 ha of rice fields and one training sample per 669 ha of non-rice cover were used. When the number of samples was doubled, the kappa coefficient and overall accuracy were 0.84 and 87%, respectively. Finally, according to the results, increasing the number of training samples does not necessarily increase the classification accuracy.
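The sampling densities quoted in the abstract can be turned back into approximate sample counts by dividing each mapped area by its hectares-per-sample figure. The following is a minimal sketch of that arithmetic only; the constant and function names are illustrative, and the resulting counts are implied by the abstract's figures rather than reported in it.

```python
# Figures quoted in the abstract.
RICE_AREA_HA = 158_384          # mapped rice area
NON_RICE_AREA_HA = 2_225_816    # mapped non-rice area
HA_PER_RICE_SAMPLE = 181        # one training sample per 181 ha of rice
HA_PER_NON_RICE_SAMPLE = 669    # one training sample per 669 ha of non-rice


def implied_sample_count(area_ha: float, ha_per_sample: float) -> int:
    """Return the number of samples implied by an area and a sampling density.

    Hypothetical helper for illustration: the result is rounded to the
    nearest whole sample because the quoted densities are themselves rounded.
    """
    return round(area_ha / ha_per_sample)


rice_samples = implied_sample_count(RICE_AREA_HA, HA_PER_RICE_SAMPLE)
non_rice_samples = implied_sample_count(NON_RICE_AREA_HA, HA_PER_NON_RICE_SAMPLE)

print(f"Implied rice training samples:     {rice_samples}")
print(f"Implied non-rice training samples: {non_rice_samples}")
```

Under these figures, the best-performing configuration corresponds to roughly 875 rice and 3,327 non-rice training samples, well below the 2,500 and 9,500 samples prepared in total.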
environmental sciences, remote sensing