Principal component analysis-multivariate adaptive regression splines (PCA-MARS) and back propagation-artificial neural network (BP-ANN) methods for predicting the efficiency of oxidative desulfurization systems using ATR-FTIR spectroscopy

Mina Sadrara, Mohammadreza Khanmohammadi Khorrami
DOI: https://doi.org/10.1016/j.saa.2023.122944
2023-11-05
Abstract: Oxidative desulfurization (ODS) of diesel fuels has received attention in recent years because of its mild working conditions and effective removal of aromatic sulfur compounds. There is a need for rapid, accurate, and reproducible analytical tools to monitor the performance of ODS systems. During the ODS process, sulfur compounds are oxidized to their corresponding sulfones, which are easily removed by extraction into polar solvents. The amount of extracted sulfones is a reliable indicator of ODS performance, reflecting both oxidation and extraction efficiency. This article studies the ability of a non-parametric regression algorithm, principal component analysis-multivariate adaptive regression splines (PCA-MARS), as an alternative to back propagation artificial neural network (BP-ANN) to predict the concentration of sulfone removed during the ODS process. Using PCA, the variables were compressed to identify the principal components (PCs) that best described the data matrix, and the scores of these PCs were used as input variables for the MARS and ANN algorithms. The coefficient of determination in calibration (R2c), root mean square error of calibration (RMSEC), and root mean square error of prediction (RMSEP) were calculated for the PCA-BP-ANN (R2c = 0.9913, RMSEC = 2.4206, RMSEP = 5.7124) and PCA-MARS (R2c = 0.9841, RMSEC = 2.7934, RMSEP = 5.8476) models and compared with those of genetic algorithm partial least squares (GA-PLS) (R2c = 0.9472, RMSEC = 5.5226, RMSEP = 9.6417). The results show that both methods outperform GA-PLS in prediction accuracy. The proposed PCA-MARS and PCA-BP-ANN models are robust, provide similar predictions, and can be used effectively for prediction in sulfone-containing samples. The MARS algorithm builds a flexible model from simpler linear regressions and is computationally more efficient than BP-ANN owing to its data-driven stepwise search, addition, and pruning.
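The piecewise-linear building block that MARS relies on, and the figures of merit quoted in the abstract, can be illustrated with a minimal Python sketch. This is not the authors' model: the knot, coefficients, PC scores, and reference values below are hypothetical, and the plain division by n in the RMSE is an assumption (some chemometrics software applies a degrees-of-freedom correction for RMSEC).

```python
import math

def hinge(x, knot, direction=1):
    """MARS hinge basis: max(0, x - knot) for direction=+1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))

def mars_like_predict(score, intercept, terms):
    """Evaluate intercept + sum of coef * hinge(score, knot, direction)."""
    return intercept + sum(c * hinge(score, k, d) for c, k, d in terms)

def rmse(y_true, y_pred):
    """Root mean square error (RMSEC on calibration data, RMSEP on test data)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical model: one mirrored pair of hinges at knot = 1.0 on a single PC score.
terms = [(2.0, 1.0, +1), (-0.5, 1.0, -1)]   # (coefficient, knot, direction)
scores = [0.0, 1.0, 2.0, 3.0]               # hypothetical PC1 scores
reference = [9.0, 10.0, 12.5, 14.0]         # hypothetical reference concentrations

predicted = [mars_like_predict(s, 10.0, terms) for s in scores]
error = rmse(reference, predicted)
r2 = r_squared(reference, predicted)
print(predicted, round(error, 4), round(r2, 4))
```

The same `rmse` function yields RMSEC when applied to the calibration set and RMSEP when applied to an independent prediction set; only the data passed in differ.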