Combination of Integrated Machine Learning Model Frameworks and Infrared Spectroscopy Towards Fast and Interpretable Characterization of Model Pyrolysis Oil

Chao Chen, Rui Liang, Jingyu Zhu, Junyu Tao, Xuebin Lv, Beibei Yan, Zhanjun Cheng, Guanyi Chen
DOI: https://doi.org/10.1016/j.renene.2024.121434
IF: 8.7
2024-01-01
Renewable Energy
Abstract: Model pyrolysis oil (bio-oil) is a promising biofuel, but its complex and unstable composition makes fast characterization essential in industrial production. This study proposes an integrated machine learning framework that predicts elemental composition, low heating value, and unsaturated concentration from infrared spectra, enabling fast and interpretable characterization of bio-oil. Within the framework, a peak loading-based strategy reduces the dimensionality of the spectral data. Bayesian-optimized random forest (RF) and extreme gradient boosting (XGBoost) models then predict bio-oil properties from the reduced spectral data, and ensemble learning combines the RF and XGBoost models for better predictive performance. The proposed characterization method achieved an average accuracy of 99.53%, an RMSE of 0.726, and an R² of 0.98. Shapley value analysis revealed that the NH2 stretch (1594 cm⁻¹), the C-H stretch (2868 cm⁻¹), and the C-N stretch in the aromatic ring (1229 cm⁻¹) contribute significantly to the characterization results. The working mechanism of the method was interpreted through the relationships among spectral peak location and height, functional group species and amount, and the predicted characteristics. The results are expected to support quality control in bio-oil production.
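The pipeline described in the abstract (reduce a spectrum to peak features, predict with two models, combine them, score with RMSE and R²) can be illustrated with a minimal standard-library sketch. Everything below is an assumption made for illustration: the local-maximum peak picker is a simple stand-in and not the paper's peak loading-based strategy, the equal-weight average is only one possible ensemble rule, and the spectrum and prediction values are made-up toy numbers rather than data from the study. The function names (pick_peaks, ensemble_average, rmse, r2) are hypothetical.

```python
import math


def pick_peaks(wavenumbers, absorbances, min_height=0.1):
    """Keep (wavenumber, height) pairs at local maxima of at least min_height.

    Assumption: a plain local-maximum rule, used here only as a stand-in
    for the peak-based dimensionality reduction named in the abstract.
    """
    peaks = []
    for i in range(1, len(absorbances) - 1):
        a = absorbances[i]
        if a >= min_height and a > absorbances[i - 1] and a >= absorbances[i + 1]:
            peaks.append((wavenumbers[i], a))
    return peaks


def ensemble_average(pred_a, pred_b, weight_a=0.5):
    """Combine two models' predictions with a weighted average (assumed rule)."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy spectrum (made-up absorbances) around the bands named in the abstract.
toy_wavenumbers = [1200, 1229, 1260, 1560, 1594, 1630, 2840, 2868, 2900]
toy_absorbances = [0.05, 0.40, 0.08, 0.10, 0.75, 0.12, 0.20, 0.55, 0.15]
toy_peaks = pick_peaks(toy_wavenumbers, toy_absorbances)

# Toy predictions standing in for the outputs of two fitted models.
toy_true = [10.0, 20.0, 30.0, 40.0]
toy_rf_pred = [11.0, 19.0, 31.0, 39.0]
toy_xgb_pred = [9.0, 22.0, 29.0, 42.0]
toy_ens_pred = ensemble_average(toy_rf_pred, toy_xgb_pred)

print("peak features:", toy_peaks)
print("RMSE rf / xgb / ensemble:",
      rmse(toy_true, toy_rf_pred),
      rmse(toy_true, toy_xgb_pred),
      rmse(toy_true, toy_ens_pred))
print("R2 ensemble:", r2(toy_true, toy_ens_pred))
```

In this toy case the two models err in opposite directions, so averaging them lowers the error. That is the general motivation for ensembling and not a reproduction of the reported results. A real implementation would replace the toy predictions with fitted RF and XGBoost regressors and tune their hyperparameters, for example by Bayesian optimization as the abstract describes.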