CO and NOx emissions prediction in gas turbine using a novel modeling pipeline based on the combination of deep forest regressor and feature engineering

Leandro dos Santos Coelho, Helon Vicente Hultmann Ayala, Viviana Cocco Mariani
DOI: https://doi.org/10.1016/j.fuel.2023.129366
IF: 7.4
2024-01-01
Fuel
Abstract: The main objective of this study is to estimate carbon monoxide (CO) and nitrogen oxides (NOx) emissions from a gas turbine using the predictive emission monitoring systems dataset. First, four methods were applied for feature generation: Principal Component Analysis, t-Distributed Stochastic Neighbor Embedding, Uniform Manifold Approximation and Projection, and Potential of Heat-diffusion for Affinity-based Trajectory Embedding. Then Feature Relevance-based Unsupervised Feature Selection was evaluated for ranking features. With all features generated, the following regression models were evaluated: Ridge Regression, Least Absolute Shrinkage and Selection Operator, k-Nearest Neighbor, Cubist Regression, Random Forest, Light Gradient Boosting Machine, Categorical Boosting, and Deep Forest Regression (DFR). The hyperparameters were tuned with Randomized Search Cross Validation using a 5-fold cross-validation (CV) procedure. The paper proposes a modeling pipeline that combines these regressors with feature engineering and hyperparameter tuning. The resulting models were compared using the mean 5-fold CV values of the coefficient of determination (R²) and the root mean square error (RMSE). Based on mean CV values, the best results were obtained with DFR, which achieved R² values of 0.9647 and 0.5355 and RMSE values of 0.4245 and 1.4508 for CO emissions on the training and validation datasets, respectively. For NOx emissions, DFR achieved R² values of 0.9866 and 0.4737 and RMSE values of 1.3876 and 7.6528 on the training and validation datasets, respectively. The results show that the proposed DFR model has higher potential for CO and NOx emissions prediction than the other evaluated regressors, and that feature engineering and hyperparameter tuning have a demonstrable impact on the predictive capacity of the evaluated regression models.
energy & fuels; engineering, chemical
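
The pipeline outlined in the abstract (generated features appended to the original ones, a regressor, and randomized hyperparameter search scored by mean 5-fold CV R² and RMSE) could be sketched roughly as follows with scikit-learn. This is an illustration under stated assumptions, not the authors' implementation: the data are synthetic rather than the predictive emission monitoring systems dataset, only PCA stands in for the four feature generators, Random Forest stands in for DFR, and the search space and helper names (`build_search`, `make_synthetic_data`, `summarize`) are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler


def build_search(random_state=0):
    # Keep the original features and append PCA components as generated features.
    features = FeatureUnion([
        ("original", FunctionTransformer()),  # identity transform: passes inputs through
        ("pca", PCA(n_components=2, random_state=random_state)),
    ])
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("features", features),
        # Assumption: Random Forest as a stand-in for the Deep Forest Regressor.
        ("model", RandomForestRegressor(random_state=random_state)),
    ])
    # Hypothetical search space; the abstract does not give the tuned ranges.
    param_distributions = {
        "features__pca__n_components": [1, 2, 3],
        "model__n_estimators": [20, 40, 60],
        "model__max_depth": [3, 5, None],
    }
    return RandomizedSearchCV(
        pipeline,
        param_distributions,
        n_iter=4,
        cv=5,  # 5-fold cross-validation, as in the abstract
        scoring={"r2": "r2", "rmse": "neg_root_mean_squared_error"},
        refit="r2",
        return_train_score=True,
        random_state=random_state,
    )


def make_synthetic_data(n_samples=200, n_features=5, seed=0):
    # Synthetic placeholder data, not the gas turbine measurements.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    y = 2.0 * X[:, 0] - X[:, 1] ** 2 + rng.normal(scale=0.1, size=n_samples)
    return X, y


def summarize(search):
    # Mean CV scores of the best candidate; RMSE is stored negated by scikit-learn.
    i = search.best_index_
    results = search.cv_results_
    return {
        "train_r2": results["mean_train_r2"][i],
        "val_r2": results["mean_test_r2"][i],
        "train_rmse": -results["mean_train_rmse"][i],
        "val_rmse": -results["mean_test_rmse"][i],
    }


X, y = make_synthetic_data()
search = build_search().fit(X, y)
summary = summarize(search)
print(summary)
```

Reporting both training and validation means, as the abstract does, makes the gap between them visible; a large gap such as 0.9647 versus 0.5355 for CO would show up here as `train_r2` well above `val_r2`.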