Gaussian process regression coupled with mRMR to predict adulterant concentration in cocaine

M J Anzanello, F S Fogliatto, D John, M F Ferrão, R S Ortiz, K C Mariotti
DOI: https://doi.org/10.1016/j.jpba.2024.116294
2024-09-15
Abstract: Street cocaine is often mixed with various substances that intensify its harmful effects. This paper proposes a framework to identify the attenuated total reflection Fourier transform infrared spectroscopy (ATR-FTIR) intervals that best predict the concentration of adulterants in cocaine samples. Wavelengths are ranked by relevance using the ReliefF and mRMR feature selection approaches, and an iterative process removes the less relevant wavelengths following the ranking produced by each approach. A Gaussian Process (GP) regression model is built after each wavelength removal, and its prediction performance is evaluated using the root mean square error (RMSE). The subset that balances a low RMSE against a small percentage of retained wavelengths is chosen. The proposed framework was validated on a dataset of 345 cocaine samples containing different amounts of levamisole, caffeine, phenacetin, and lidocaine. Averaged over the four adulterants, GP regression coupled with mRMR retained 1.07 % of the 662 original wavelengths and outperformed PLS and SVR in prediction performance.
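The rank, remove, refit, and score loop described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the data are synthetic toy values rather than ATR-FTIR spectra, absolute Pearson correlation stands in for the mutual-information measures that mRMR normally uses, the GP uses a fixed RBF kernel with no hyperparameter tuning, and the final selection rule (smallest subset whose RMSE is within 10 % of the best) is a hypothetical stand-in for the paper's own balancing criterion.

```python
import math
import random

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((z - mb) ** 2 for z in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def mrmr_rank(X, y):
    # Greedy mRMR-style ranking (assumption: correlation replaces mutual information).
    # Score = relevance to y minus mean redundancy with already ranked features.
    cols = [[row[j] for row in X] for j in range(len(X[0]))]
    relevance = [abs(pearson(c, y)) for c in cols]
    ranked, remaining = [], list(range(len(cols)))
    while remaining:
        def score(j):
            if not ranked:
                return relevance[j]
            red = sum(abs(pearson(cols[j], cols[k])) for k in ranked) / len(ranked)
            return relevance[j] - red
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked  # most relevant first

def rbf(a, b, length=1.0):
    return math.exp(-sum((x - z) ** 2 for x, z in zip(a, b)) / (2 * length ** 2))

def solve(A, b):
    # Gaussian elimination with partial pivoting, then back substitution.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(Xtr, ytr, Xte, noise=1e-2):
    # GP posterior mean with a constant prior mean equal to the training average.
    mean_y = sum(ytr) / len(ytr)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(Xtr)]
         for i, a in enumerate(Xtr)]
    alpha = solve(K, [v - mean_y for v in ytr])
    return [mean_y + sum(rbf(x, a) * w for a, w in zip(Xtr, alpha)) for x in Xte]

def rmse(pred, true):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def take(X, keep):
    return [[row[j] for j in keep] for row in X]

def backward_elimination(Xtr, ytr, Xte, yte):
    # Drop the lowest-ranked feature one at a time, refitting the GP each time.
    ranking = mrmr_rank(Xtr, ytr)
    results = []
    for k in range(len(ranking), 0, -1):
        keep = ranking[:k]
        pred = gp_predict(take(Xtr, keep), ytr, take(Xte, keep))
        results.append((k, keep, rmse(pred, yte)))
    return results

# Synthetic toy data: 6 "wavelengths", only the first two carry signal.
random.seed(0)
X = [[random.random() for _ in range(6)] for _ in range(60)]
y = [2 * r[0] - r[1] + random.gauss(0, 0.05) for r in X]
results = backward_elimination(X[:40], y[:40], X[40:], y[40:])

# Hypothetical selection rule: smallest subset within 10 % of the best RMSE.
best_rmse = min(r[2] for r in results)
chosen = min((r for r in results if r[2] <= 1.1 * best_rmse), key=lambda r: r[0])
for k, keep, err in results:
    print(k, keep, round(err, 4))
print("chosen subset:", chosen[1])
```

In practice the hold-out RMSE would typically come from cross-validation, and a library GP with optimized kernel hyperparameters would replace the fixed-kernel helper used here.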