Combinatorial Order Pre-processing Search (COPS): A New Pre-processing Strategy for Large-Scale Interpretable Data Analysis in Process Analytical Technologies

Wilson Cardoso, Jussara V. Roque, Jeroen J. Jansen, Sin Yong Teng, Reinaldo F. Teófilo
DOI: https://doi.org/10.1016/j.compchemeng.2024.108892
IF: 4.13
2024-10-13
Computers & Chemical Engineering
Abstract: Combinatorial Order Pre-processing Search (COPS), a novel approach for optimizing data pre-processing, is proposed in this work. Unlike simultaneous hyperparameter optimization, COPS employs a priori optimization to reduce computational time while refining the search space for pre-processing sequences and combinations. It allows a maximum number of pre-processing methods to be set, while efficiently searching through combinations of methods using chemically relevant knowledge. In this work, 67 calibration datasets were evaluated across various analytical techniques, including fluorescence spectroscopy, gas chromatography (GC), near-infrared spectroscopy (NIR), mid-infrared spectroscopy (MID), visible-near-infrared spectroscopy (Vis-NIR), Raman spectroscopy, nuclear magnetic resonance (NMR) spectroscopy, and voltammetry. COPS yielded significant improvements over existing methodologies based on design of experiments and compounded pre-processing approaches. It outperformed the other methods, reducing the root mean square error of prediction (RMSEP) by 31.7% on average while also reducing model complexity (the number of latent variables), which allows for easier interpretation. This underscores the importance of combinatorial order set theory in the search for pre-processing method combinations (without fixing the sequence of pre-processing methods) to enhance model performance and interpretation. The novel COPS approach can be employed in process analytical technology (such as inline, online, or at-line chemical sensing analytics) to enhance predictive accuracy and operational efficiency, improving the quality and reliability of chemical process monitoring and control.
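The abstract describes a search over pre-processing methods in which the order is not fixed and the number of methods is capped. The sketch below is not the COPS algorithm (the abstract does not specify its a priori optimization or how chemical knowledge prunes the search); it only illustrates, under stated assumptions, what such a search space looks like and how RMSEP is computed. The method labels and the function names `ordered_sequences` and `rmsep` are hypothetical and chosen for illustration.

```python
from itertools import permutations
from math import sqrt


def ordered_sequences(methods, max_steps):
    """Yield every ordered sequence of distinct methods with 0..max_steps steps.

    Assumption: each method is used at most once per sequence; the abstract
    does not say whether repeats are allowed.
    """
    for k in range(max_steps + 1):
        yield from permutations(methods, k)


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a test set."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical method labels, not taken from the paper.
methods = ["smoothing", "derivative", "SNV", "mean_centering"]

# Cap the number of pre-processing steps at 2.
candidates = list(ordered_sequences(methods, max_steps=2))

# 1 empty sequence + 4 single methods + 4*3 ordered pairs
print(len(candidates))  # 17

# Order matters: both orderings of the same pair are distinct candidates.
print(("SNV", "derivative") in candidates)   # True
print(("derivative", "SNV") in candidates)   # True
```

With n methods and at most m steps, this unpruned space contains the sum of n!/(n-k)! for k from 0 to m sequences, which grows quickly. That growth is the reason a strategy for narrowing the search, such as the one the abstract attributes to COPS, becomes necessary for large method libraries.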
engineering, chemical; computer science, interdisciplinary applications