A machine learning model for estimating daily maximum 8-hour average ozone concentrations using OMI and MODIS products

Chau-Ren Jung, Wei Chen, Wei-Ting Chen, Shih-Hao Su, Bo-Ting Chen, Ling Chang, Bing-Fang Hwang
DOI: https://doi.org/10.1016/j.atmosenv.2024.120587
IF: 5
2024-05-19
Atmospheric Environment
Abstract: Tropospheric ozone (O3) is a criteria air pollutant that poses risks to organisms, and its formation is expected to increase with climate change. Satellite-based measurements offer a promising approach to estimating ground-level air pollution at large scales. However, most applications of satellite-based measurements have targeted fine particulate matter and nitrogen dioxide, and only a few have addressed O3. In this study, we combined satellite-based measurements from the Ozone Monitoring Instrument (OMI) and the Moderate Resolution Imaging Spectroradiometer (MODIS) with meteorological variables and land-use data to estimate daily maximum 8-hour average O3 at 1-km resolution in Taiwan during 2004–2020. A random forest model was used to impute missing values in the satellite-based measurements, and an XGBoost model was used to estimate daily O3 concentrations. Model performance was evaluated by 10-fold cross-validation (CV), temporal validation, and spatial validation, and reported as the coefficient of determination (R²) and root mean square error (RMSE). The 10-fold CV, temporal validation, and spatial validation R² (RMSE) of the XGBoost model were 0.82 (7.71 ppb), 0.63 (11.09 ppb), and 0.68 (10.27 ppb), respectively. Model performance was better in central and southern Taiwan. The ten most important predictors were date (relative importance = 12.15%), temperature (10.77%), meridional wind (10.71%), relative humidity (9.60%), zonal wind (8.14%), UV radiation (8.07%), total precipitation (6.35%), surface pressure (5.34%), surface O3 volume mixing ratio (4.93%), and boundary layer height (4.69%). The spatial distribution of the estimates showed that daily maximum 8-hour average O3 concentrations were higher in the suburban and mountainous areas near central and southern Taiwan. This suggests that sensitive populations should remain attentive to secondary pollutants even outside urban areas. The O3 estimates can be further used to evaluate the short-term and long-term effects of O3 on human health.
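The abstract reports R² and RMSE under three validation schemes without giving their details. The following is a minimal sketch, not the authors' code, of how such metrics and grouped folds could be computed. It assumes the standard definition R² = 1 - SS_res/SS_tot, and it assumes that spatial validation holds out whole monitoring stations and temporal validation holds out whole dates. The function names, station labels, and toy values are all hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (e.g. ppb)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r2_score(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def grouped_folds(keys, n_folds):
    """Assign a fold index to every record so that records sharing a key
    (a station for spatial validation, a date for temporal validation)
    always fall in the same fold."""
    unique_keys = sorted(set(keys))
    fold_of_key = {key: i % n_folds for i, key in enumerate(unique_keys)}
    return [fold_of_key[key] for key in keys]

# Toy records: (station, date, observed O3 in ppb, predicted O3 in ppb).
# These values are invented for illustration only.
records = [
    ("ST01", "2020-01-01", 40.0, 42.0),
    ("ST01", "2020-01-02", 55.0, 50.0),
    ("ST02", "2020-01-01", 30.0, 33.0),
    ("ST02", "2020-01-02", 48.0, 47.0),
]
observed = [r[2] for r in records]
predicted = [r[3] for r in records]

spatial_folds = grouped_folds([r[0] for r in records], n_folds=2)
temporal_folds = grouped_folds([r[1] for r in records], n_folds=2)

print("spatial folds:", spatial_folds)
print("temporal folds:", temporal_folds)
print("R2:", r2_score(observed, predicted), "RMSE (ppb):", rmse(observed, predicted))
```

In an actual validation run, the predictions for each fold would come from a model trained only on the remaining folds. Grouping by station or by date prevents the model from seeing the held-out locations or days during training, which is consistent with the temporal and spatial scores in the abstract being lower than the ordinary 10-fold CV score.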
environmental sciences, meteorology & atmospheric sciences