Prediction of PM2.5 Concentration Level Based on Random Forest and Meteorological Parameters

Cairong REN, Gang XIE
DOI: https://doi.org/10.3778/j.issn.1002-8331.1709-0378
2019-01-01
Abstract: Air pollution, especially PM2.5, not only harms people's physical and mental health but also restricts the economic development of cities. To forecast the PM2.5 concentration level conveniently and accurately, a prediction model based on random forest is proposed. The feature factors are the meteorological data of Taiyuan city from 2013 to 2016, the temporal pattern of PM2.5 concentration change at the prediction site, and its temporal and spatial correlation with the surrounding sites. First, the K-Means algorithm is applied to cluster the raw meteorological data in order to reduce the correlation between different classifiers. Second, undersampling is used to balance the dataset and so reduce the impact of class imbalance on classifier performance. Finally, a predictive model is constructed using random forest, which has good generalization ability. Verified on real data, the method achieves good recall, precision, and F-score in predicting the PM2.5 concentration level.
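The balancing and evaluation steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which undersampling variant was used, so plain random undersampling is assumed here, the F-score is assumed to be F1, and the function names and class labels are hypothetical.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class keeps as many as the rarest class.

    Assumption: plain random undersampling; the paper may use another variant.
    """
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # Every class is cut down to the size of the smallest class.
    n_keep = min(len(xs) for xs in by_class.values())

    balanced = []
    for y, xs in by_class.items():
        for x in rng.sample(xs, n_keep):
            balanced.append((x, y))
    rng.shuffle(balanced)

    xs_out = [x for x, _ in balanced]
    ys_out = [y for _, y in balanced]
    return xs_out, ys_out


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 for one concentration level treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

In a full pipeline the balanced samples would then be passed to a random forest classifier, and the metrics would be computed per concentration level on held-out data.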