A Novel Approach for Predicting Anthropogenic CO2 Emissions Using Machine Learning Based on Clustering of the CO2 Concentration

Zhanghui Ji, Hao Song, Liping Lei, Mengya Sheng, Kaiyuan Guo, Shaoqing Zhang
DOI: https://doi.org/10.3390/atmos15030323
IF: 3.11
2024-03-06
Atmosphere
Abstract: Monitoring anthropogenic CO2 emissions, which increase the atmospheric CO2 concentration, plays a central role in managing emission reduction and control. With the massive increase in satellite-based observation data related to carbon emissions, data-driven machine learning methods have great prospects for predicting anthropogenic CO2 emissions. The training samples used to build these machine learning models play a key role in obtaining accurate predictions, because anthropogenic CO2 emissions are spatially heterogeneous. We propose an approach for predicting anthropogenic CO2 emissions that derives the training datasets from clustering of the atmospheric CO2 concentration and segmentation of emissions, in order to address the spatial heterogeneity of anthropogenic CO2 emissions in machine learning modeling. We assessed machine learning algorithms based on gradient-boosted decision trees (GBDT), including LightGBM, XGBoost, and CatBoost. We used multiple parameters related to anthropogenic CO2-emitting activities as predictor variables, together with emission inventory data from 2019 to 2021, and we compared and verified the accuracy and effectiveness of prediction models built from different training-dataset sampling methods combined with different machine learning algorithms. The anthropogenic CO2 emissions predicted by the CatBoost model trained on the dataset derived from the clustering analysis and segmentation method showed the best prediction accuracy and performance. Because it relies on machine learning with observation data, this approach could serve as one emission monitoring tool for quickly obtaining up-to-date information on anthropogenic CO2 emissions.
environmental sciences, meteorology & atmospheric sciences
What problem does this paper attempt to address?
The main problem that this paper attempts to solve is **how to use machine-learning methods, based on the cluster analysis of atmospheric carbon dioxide concentrations, to accurately predict anthropogenic carbon dioxide emissions**. Specifically, the paper addresses the spatial heterogeneity of anthropogenic carbon dioxide emissions by optimizing the sampling methods of training data sets and existing machine-learning algorithms, thereby improving the accuracy and performance of prediction. The research is specifically targeted at China, a high-emission country, and aims to provide an effective method to monitor and control anthropogenic carbon dioxide emissions, in order to support the government in implementing emission-reduction measures and mitigating the impact of global warming.
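
The sampling idea described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each grid cell carries a CO2 concentration value and an inventory emission value, uses a plain 1-D k-means as a stand-in for the clustering step, and uses fixed bin edges as a stand-in for the emission segmentation. The field names (`xco2`, `emission`), the helper functions, the bin edges, and the synthetic numbers are all hypothetical. The resulting sample would then be passed to a GBDT library such as CatBoost for training, which is omitted here.

```python
import random
from collections import defaultdict


def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means (assumed stand-in); returns a cluster label per value."""
    ordered = sorted(values)
    # Initialise centres at evenly spaced quantiles so the run is deterministic.
    centres = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels


def emission_segment(emission, edges):
    """Index of the emission segment, given ascending bin edges (hypothetical)."""
    return sum(emission >= e for e in edges)


def stratified_sample(cells, k, edges, per_stratum, seed=0):
    """Draw up to `per_stratum` cells from every (cluster, segment) stratum."""
    labels = kmeans_1d([c["xco2"] for c in cells], k)
    strata = defaultdict(list)
    for cell, lab in zip(cells, labels):
        strata[(lab, emission_segment(cell["emission"], edges))].append(cell)
    rng = random.Random(seed)
    sample = []
    for key in sorted(strata):
        group = strata[key]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample, strata


# Synthetic grid cells (made-up numbers, for illustration only):
# many low-emission background cells and a few high-emission cells.
rng = random.Random(42)
cells = [{"xco2": rng.gauss(410.0, 1.0), "emission": rng.expovariate(1.0)}
         for _ in range(900)]
cells += [{"xco2": rng.gauss(415.0, 1.0), "emission": 10.0 + rng.expovariate(0.1)}
          for _ in range(100)]

sample, strata = stratified_sample(cells, k=3, edges=[1.0, 10.0], per_stratum=20)
```

Capping the draw per stratum keeps the numerous low-emission background cells from dominating the training set, which is one simple way to account for spatial heterogeneity when sampling.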