A comparative study of statistical and machine learning models on near-real-time daily emissions prediction

Xiangqian Li

DOI: https://doi.org/10.48550/arXiv.2302.01152

2023-02-02

Abstract: The rapid rise in carbon dioxide emissions is a major cause of global warming and climate change, which pose a serious threat to human survival and have a far-reaching influence on the global ecosystem. It is therefore necessary to control carbon dioxide emissions effectively by predicting and analyzing their trend accurately and in a timely manner, so as to provide a reference for mitigation measures. This paper aims to select a suitable model for predicting near-real-time daily emissions based on univariate daily time-series data from January 1st, 2020 to September 30th, 2022 for all sectors (Power, Industry, Ground Transport, Residential, Domestic Aviation, International Aviation) in China. We propose six prediction models: three statistical models, namely grey prediction (GM(1,1)), autoregressive integrated moving average (ARIMA) and seasonal autoregressive integrated moving average with exogenous factors (SARIMAX); and three machine learning models, namely artificial neural network (ANN), random forest (RF) and long short-term memory (LSTM). To evaluate the performance of these models, five criteria are introduced and discussed in detail: Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and Coefficient of Determination (R²). In the results, the three machine learning models perform better than the three statistical models, and the LSTM model performs best on all five criteria for daily emissions prediction, with an MSE of 3.5179e-04, an RMSE of 0.0187, an MAE of 0.0140, a MAPE of 14.8291% and an R² of 0.9844.

Artificial Intelligence, Machine Learning, Physics and Society

What problem does this paper attempt to address?

The main problem this paper addresses is selecting the model best suited to predicting daily carbon dioxide emissions across all sectors in China (power, industry, ground transport, residential, domestic aviation, international aviation), by comparing the performance of statistical and machine-learning models on near-real-time daily emissions prediction. The paper uses daily time-series data from January 1, 2020 to September 30, 2022 and proposes six prediction models: three statistical models (the grey prediction model GM(1,1), the autoregressive integrated moving average model ARIMA, and the seasonal autoregressive integrated moving average model with exogenous factors SARIMAX) and three machine-learning models (artificial neural network ANN, random forest RF, and long short-term memory network LSTM). The models are evaluated on five criteria (Mean Squared Error MSE, Root Mean Squared Error RMSE, Mean Absolute Error MAE, Mean Absolute Percentage Error MAPE, and Coefficient of Determination R²). The paper concludes that the LSTM model performs best on all five criteria and is especially suitable for near-real-time daily carbon dioxide emissions prediction based on long time-series data. Specifically, the paper aims to:

1. **Improve prediction accuracy**: By comparing the prediction performance of different models, find the one that provides the most accurate predictions to support the formulation of carbon emission mitigation measures.
2. **Fill research gaps**: Most existing research on carbon dioxide emissions focuses on annual prediction, whereas this paper focuses on short-cycle daily prediction, which helps policy responses to be adjusted in a timely manner.
3. **Provide decision-making support**: Through accurate prediction results, give policy-makers a reference for controlling and reducing carbon dioxide emissions more effectively.

The innovation of this paper lies in using an extended near-real-time carbon dioxide emissions data set with daily frequency to select the most appropriate daily prediction model, which not only improves prediction accuracy but also gives decision-makers an opportunity to adjust policies in a timely manner.
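The five evaluation criteria named above can be made concrete with a short sketch. The code below is not taken from the paper; it uses the standard textbook definitions of MSE, RMSE, MAE, MAPE and R², and the function name `evaluate_forecast` and the sample values are hypothetical, chosen only for illustration. It assumes that no actual value is zero (otherwise MAPE is undefined) and that the actual values are not all identical (otherwise R² is undefined).

```python
import math


def evaluate_forecast(actual, predicted):
    """Return MSE, RMSE, MAE, MAPE (in percent) and R^2 for two sequences.

    Hypothetical helper for illustration; standard definitions are assumed.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by each actual value, so zeros in `actual` are not allowed.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Made-up sample values, not data from the paper.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = evaluate_forecast(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Lower values are better for the four error measures, while R² is better the closer it is to 1. RMSE is the square root of MSE by definition, so the two always rank models identically.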

Estimating Air Methane and Total Hydrocarbon Concentrations in Alberta, Canada Using Machine Learning

A Machine Learning Approach for Generating and Evaluating Forecasts on the Environmental Impact of the Buildings Sector

Prediction of PM2.5 Concentration Using Spatiotemporal Data with Machine Learning Models

The research model of forecasting agricultural carbon emissions based on ARIMA-LSTM

Quantitative Analysis and Forecasting of Industrial CO2 Emissions using Multiple Machine Learning Models

A daily carbon emission prediction model combining two-stage feature selection and optimized extreme learning machine

Coupling LSTM and CNN Neural Networks for Accurate Carbon Emission Prediction in 30 Chinese Provinces

Accurate and efficient daily carbon emission forecasting based on improved ARIMA

Predicting PM2.5 levels and exceedance days using machine learning methods

Innovative approach to daily carbon dioxide emission forecast based on ensemble of quantile regression and attention BILSTM

Modeling and predicting city-level CO2 emissions using open access data and machine learning

A novel hybrid machine learning model for prediction of CO2 using socio-economic and energy attributes for climate change monitoring and mitigation policies

Evaluation of Machine Learning Models in Air Pollution Prediction for a Case Study of Macau as an Effort to Comply with UN Sustainable Development Goals

An extensive investigation on leveraging machine learning techniques for high-precision predictive modeling of CO2 emission

Neurological differences between paranoid and nonparanoid schizophrenia: part II. computerized tomographic findings.

A machine learning algorithm to explore the drivers of carbon emissions in Chinese cities

Decoupling representation contrastive learning for carbon emission prediction and analysis based on time series

Time series-based PM2.5 concentration prediction in Jing-Jin-Ji area using machine learning algorithm models

Reconstructing Global Daily CO2 Emissions via Machine Learning

[Construction and Analysis of Machine Learning Based Transportation Carbon Emission Prediction Model]