Interpretable reconstruction of naphtha components using property-based extreme gradient boosting and compositional-weighted Shapley additive explanation values

Yi Shi, Weimin Zhong, Xin Peng, Minglei Yang, Wei Du
DOI: https://doi.org/10.1016/j.ces.2023.119462
IF: 4.7
2024-02-01
Chemical Engineering Science
Abstract: Various methods exist for reconstructing the molecular composition of petroleum feedstocks from their bulk properties. While data-driven approaches are precise and efficient, they often lack mechanistic insight. This paper presents an interpretable, data-driven model for naphtha composition reconstruction. Utilizing a property-based Extreme Gradient Boosting (XGBoost) model, optimized with the Tree Parzen Estimator (TPE) and property mixing rules, we achieve notable accuracy. The model leverages Shapley Additive Explanations (SHAP) to elucidate the influence of each property on specific compositions. Moreover, we introduce a compositional-weighted SHAP metric, revealing overarching molecular distribution patterns. Our analyses show that PIONA values and boiling points have a more pronounced effect on molecular compositions than other examined properties. Finally, the SOL-CNN model is employed for accurate property prediction of predefined components.
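The abstract does not define the compositional-weighted SHAP metric, so the following is only a minimal sketch of one plausible reading: compute Shapley values per predicted component, then average their magnitudes across components using the component fractions as weights. The function names, the two toy linear "component models", the property names, and all numbers are hypothetical illustrations and are not taken from the paper; the exact enumeration over feature orderings stands in for the tree-based SHAP computation that would be used with an XGBoost model.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    A feature counts as "absent" when it takes its baseline value.
    Enumerates every feature ordering, so only practical for few features.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        z = list(baseline)
        prev = predict(z)
        for i in order:
            z[i] = x[i]          # switch feature i from baseline to actual value
            cur = predict(z)
            phi[i] += cur - prev  # marginal contribution of feature i
            prev = cur
    n_orders = factorial(n)
    return [p / n_orders for p in phi]


def composition_weighted_importance(shap_by_component, fractions):
    """Assumed aggregation: sum over components of w_c * |phi_c[j]|,
    with weights w_c equal to the normalised component fractions."""
    total = sum(fractions)
    n_features = len(shap_by_component[0])
    return [
        sum((w / total) * abs(phi[j])
            for phi, w in zip(shap_by_component, fractions))
        for j in range(n_features)
    ]


# Hypothetical toy models: each maps two bulk properties to one component fraction.
def component_a(z):
    return 0.5 * z[0] + 0.1 * z[1]


def component_b(z):
    return 0.2 * z[0] - 0.4 * z[1]


property_names = ["paraffin_content", "boiling_point"]  # illustrative labels only
x = [1.0, 2.0]          # scaled property values for one sample (made up)
baseline = [0.0, 0.0]   # reference property values (made up)
fractions = [0.75, 0.25]  # component fractions used as weights (made up)

shap_by_component = [
    shapley_values(component_a, x, baseline),
    shapley_values(component_b, x, baseline),
]
importance = composition_weighted_importance(shap_by_component, fractions)

for name, value in zip(property_names, importance):
    print(f"{name}: {value:.3f}")
```

Under this assumed definition, components that make up a larger share of the mixture contribute more to the overall ranking of properties, which is one way a single score per property could summarize patterns across many components.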
engineering, chemical