Precipitation Estimation Using FY-4B/AGRI Satellite Data Based on Random Forest

Yang Huang, Yansong Bao, George P. Petropoulos, Qifeng Lu, Yanfeng Huo, Fu Wang
DOI: https://doi.org/10.3390/rs16071267
IF: 5
2024-04-04
Remote Sensing
Abstract: Precipitation is the basic component of the Earth's water cycle. Obtaining high-resolution and high-precision precipitation data is of great significance. This paper establishes a precipitation retrieval model based on a random forest classification and regression model during the day and at night with FY-4B/AGRI Level1 data over China from July to August 2022. To evaluate the retrieval effect of the model, the GPM IMERG product is used as a reference, and the retrieval results are compared against those of the FY-4B/AGRI operational precipitation product. In addition, the retrieval results are analyzed according to different underlying surfaces. The results showed that compared with the FY-4B/AGRI operational precipitation product, the retrieval model can better identify precipitation and capture precipitation areas of light rain, moderate rain, heavy rain and torrential rain. Among them, the probability of detection (POD) of the day model increased from 0.328 to 0.680, and the equitable threat score (ETS) increased from 0.252 to 0.432. The POD of the night model increased from 0.337 to 0.639, and the ETS increased from 0.239 to 0.369. Meanwhile, the precipitation estimation accuracy of the day model increased by 38.98% and that of the night model increased by 40.85%. Our results also showed that due to the surface uniformity of the ocean, the model can identify precipitation better on the ocean than on the land. Our findings also indicated that for the different underlying surfaces of the land, there is no significant difference in each evaluation index of the model. This is a strong argument for the universal applicability of the model. Notably, the results showed that, especially for more vegetated areas and areas covered by water, the model is capable of estimating precipitation.
In conclusion, the precipitation retrieval model that is proposed herein can better determine precipitation regions and estimate precipitation intensities compared with the FY-4B/AGRI operational precipitation product. It can provide some reference value for future precipitation retrieval research on FY-4B/AGRI.
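The POD and ETS scores quoted in the abstract are standard contingency-table metrics for rain/no-rain detection. The minimal sketch below shows how they are commonly computed from paired estimate and reference values. The function name `pod_ets` and the 0.1 mm/h rain threshold are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def pod_ets(estimated, reference, threshold=0.1):
    """Return (POD, ETS) for rain/no-rain detection.

    `threshold` is an assumed rain/no-rain cutoff (mm/h), not a value
    taken from the paper. The reference must contain at least one
    rainy sample, otherwise POD is undefined.
    """
    est_rain = np.asarray(estimated) >= threshold
    ref_rain = np.asarray(reference) >= threshold

    hits = np.sum(est_rain & ref_rain)           # rain in both
    misses = np.sum(~est_rain & ref_rain)        # rain only in the reference
    false_alarms = np.sum(est_rain & ~ref_rain)  # rain only in the estimate
    total = est_rain.size

    # Hits expected by chance, given the two marginal rain frequencies.
    hits_random = (hits + misses) * (hits + false_alarms) / total

    pod = hits / (hits + misses)
    ets = (hits - hits_random) / (hits + misses + false_alarms - hits_random)
    return float(pod), float(ets)

# Toy values, purely illustrative.
estimated = [0.0, 0.5, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0]
reference = [0.0, 0.4, 0.0, 0.0, 3.0, 1.2, 0.0, 0.0]
pod, ets = pod_ets(estimated, reference)
```

POD rewards only detected rain, so it can be inflated by over-predicting rain. ETS also penalizes false alarms and discounts hits expected by chance, which is why the two scores are usually reported together.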
environmental sciences; imaging science & photographic technology; remote sensing; geosciences, multidisciplinary
What problem does this paper attempt to address?
The problem this paper addresses is how to obtain high-resolution, high-precision precipitation data over China. Specifically, the authors use FY-4B/AGRI satellite data to build separate day and night precipitation retrieval models based on the random forest (RF) algorithm. The models are evaluated against the GPM IMERG product as a reference and compared with the FY-4B/AGRI operational precipitation product. The research aims to improve the accuracy of precipitation identification and intensity estimation, especially the ability to identify areas of light, moderate, heavy and torrential rain.

### Main research contents and methods

1. **Data sources**:
   - **FY-4B/AGRI Level1 data**: FY-4B is the first operational satellite in China's Fengyun-4 meteorological satellite series, and its Advanced Geostationary Radiation Imager (AGRI) is one of the main payloads. AGRI has 15 channels covering a wavelength range of 0.45 to 13.6 µm.
   - **GPM IMERG product**: Used as reference data; provides multi-satellite fused precipitation data with high spatial and temporal resolution.
   - **FY-4B/AGRI operational precipitation product**: Purely satellite-estimated precipitation, not calibrated against ground rain-gauge stations.
   - **Terrain data**: The ETOPO2v2 global terrain model.
   - **Land-cover-type data**: The 30-meter-resolution global land-cover dynamic monitoring product released by Liu Liangyun's team at the Aerospace Information Research Institute, Chinese Academy of Sciences.
2. **Data pre-processing**:
   - Match the FY-4B/AGRI Level1 data, the FY-4B/AGRI operational precipitation product, the terrain data, the land-cover-type data and the GPM IMERG product in time and space to build a spatio-temporally matched data set.
   - Divide the data set into day (SZ < 85°) and night (SZ ≥ 85°) subsets according to the solar zenith angle (SZ).
3. **Methods**:
   - **Random forest algorithm**: RF is a classic ensemble learning method that can be used for both classification and regression. Different sample sets are drawn at random from the original data set, and a separate decision tree is built on each of them.
   - **Feature variable selection**: Physical variables related to the precipitation process are selected as model inputs, mainly cloud-top height (CTH), cloud-top temperature (CTT), cloud-water path (CWP), cloud-phase state (CP) and water vapor (WV).

### Research results

- **Model performance**:
  - Compared with the FY-4B/AGRI operational precipitation product, the model is significantly better at identifying areas of light, moderate, heavy and torrential rain.
  - The probability of detection (POD) of the day model increased from 0.328 to 0.680, and the equitable threat score (ETS) increased from 0.252 to 0.432.
  - The POD of the night model increased from 0.337 to 0.639, and the ETS increased from 0.239 to 0.369.
  - The precipitation estimation accuracy increased by 38.98% for the day model and by 40.85% for the night model.
- **Analysis of different underlying surfaces**:
  - The model identifies precipitation better over the ocean than over land because the ocean surface is more uniform.
  - Across different land-surface types, the evaluation indicators show no significant differences, indicating that the model is generally applicable.

### Conclusion

The random-forest-based precipitation retrieval model proposed in this study determines precipitation areas and estimates precipitation intensity more accurately than the FY-4B/AGRI operational precipitation product, and it is particularly strong at identifying areas of light, moderate, heavy and torrential rain. The model provides a valuable reference for future precipitation retrieval research based on FY-4B/AGRI satellite data.
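
The workflow summarized above (a day/night split on solar zenith angle, then random forest classification followed by regression) can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the authors' implementation. The function names, the 0.1 mm/h rain threshold, the hyperparameters and the random feature matrix are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

RAIN_THRESHOLD = 0.1  # assumed rain/no-rain cutoff in mm/h (not from the paper)
DAY_SZ_LIMIT = 85.0   # day if solar zenith angle < 85 degrees, per the summary

def fit_two_stage(features, rain_rate, seed=0):
    """Fit a rain/no-rain classifier, then a rain-rate regressor on rainy samples."""
    is_rain = rain_rate >= RAIN_THRESHOLD
    clf = RandomForestClassifier(n_estimators=50, random_state=seed)
    clf.fit(features, is_rain.astype(int))
    reg = RandomForestRegressor(n_estimators=50, random_state=seed)
    reg.fit(features[is_rain], rain_rate[is_rain])
    return clf, reg

def predict_two_stage(models, features):
    """Estimate rain rate; samples classified as dry stay at zero."""
    clf, reg = models
    rainy = clf.predict(features) == 1
    out = np.zeros(len(features))
    if rainy.any():
        out[rainy] = reg.predict(features[rainy])
    return out

def fit_day_night(features, solar_zenith, rain_rate):
    """Train separate two-stage models for the day and night subsets."""
    day = solar_zenith < DAY_SZ_LIMIT
    return {
        "day": fit_two_stage(features[day], rain_rate[day]),
        "night": fit_two_stage(features[~day], rain_rate[~day]),
    }

def predict_day_night(models, features, solar_zenith):
    """Route each sample to the day or night model by its solar zenith angle."""
    day = solar_zenith < DAY_SZ_LIMIT
    out = np.zeros(len(features))
    if day.any():
        out[day] = predict_two_stage(models["day"], features[day])
    if (~day).any():
        out[~day] = predict_two_stage(models["night"], features[~day])
    return out

# Synthetic demo data: five random columns stand in for the model inputs.
rng = np.random.default_rng(0)
n = 400
features = rng.normal(size=(n, 5))
solar_zenith = rng.uniform(0.0, 180.0, size=n)
rain_rate = np.clip(2.0 * features[:, 0] + rng.normal(scale=0.3, size=n), 0.0, None)

models = fit_day_night(features, solar_zenith, rain_rate)
estimate = predict_day_night(models, features, solar_zenith)
```

Training the regressor only on rainy samples keeps the large number of zero-rain pixels from dominating the intensity estimate, and the classifier decides where that estimate is applied. The demo predicts on its own training data, so it only shows the data flow; real use would hold out separate validation samples.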