An intuitive approach for spike removal in Raman spectra based on peaks' prominence and width

Nicolas Coca-Lopez
DOI: https://doi.org/10.1016/j.aca.2024.342312
IF: 6.911
2024-02-03
Analytica Chimica Acta
Abstract:
Background: Raman spectroscopists are familiar with the challenge of dealing with spikes caused by cosmic rays. These artifacts may lead to errors in subsequent data processing steps, such as calibration, normalization, or spectral search. Spike removal is therefore a fundamental step in Raman spectral data pre-treatment, but publicly available code for spike removal is limited, and the performance of existing tools in correcting spectra is often unknown. There is therefore a need to develop and test open-source, easy-to-implement algorithms that improve the Raman data processing workflow.
Results: In this work, we present and validate two approaches for spike detection and correction in Raman spectral data from graphene: i) an algorithm based on the peaks' widths and prominences, and ii) an algorithm based on the ratio of these two peak features. The first algorithm provides an efficient and reliable approach for spike detection in real and synthetic Raman spectra by imposing thresholds on the peaks' width and prominence. The second approach leverages the prominence/width ratio for outlier detection. It relies on the calculation of a limit of detection based on either one or several spectra, enabling the automatic identification of both cosmic-ray spikes and low-intensity spikes originating from noise. Both algorithms detect low-intensity spikes down to at least ≈10% of the highest Raman peak in spectra with different noise levels. To address their limitations and prove their versatility, the algorithms were further tested on Raman spectra from calcite and polystyrene.
Significance: Our intuitive, open-source algorithms have been validated and allow automatic correction for a given set of samples. They do not require any pre-processing steps such as calibration or baseline subtraction, and their implementation with Python libraries is computationally efficient, allowing for immediate use within existing open-source packages for Raman spectra processing.
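The two ideas described in the abstract can be illustrated with a minimal Python sketch. This is not the author's published code. It assumes SciPy's `scipy.signal.find_peaks` as the peak finder, and the function names (`detect_spikes`, `remove_spikes`, `flag_by_ratio`), the threshold values, the linear-interpolation correction, and the synthetic spectrum are all hypothetical choices made for illustration. The abstract does not say how the limit of detection is calculated, so the ratio limit is simply passed in by the caller here.

```python
import numpy as np
from scipy.signal import find_peaks


def detect_spikes(intensity, min_prominence, max_width):
    """Approach i (sketch): keep peaks that are prominent AND narrow.

    Thresholds are illustrative assumptions, in intensity units and samples.
    """
    peaks, props = find_peaks(
        intensity,
        prominence=min_prominence,   # minimum prominence
        width=(None, max_width),     # no minimum width, maximum width
    )
    return peaks, props


def remove_spikes(intensity, props, pad=1):
    """Replace flagged samples by linear interpolation over their neighbours.

    The interpolation is an assumed correction step, not taken from the paper.
    """
    corrected = np.asarray(intensity, dtype=float).copy()
    mask = np.zeros(corrected.size, dtype=bool)
    for left, right in zip(props["left_ips"], props["right_ips"]):
        lo = max(int(np.floor(left)) - pad, 0)
        hi = min(int(np.ceil(right)) + pad, corrected.size - 1)
        mask[lo:hi + 1] = True
    idx = np.arange(corrected.size)
    corrected[mask] = np.interp(idx[mask], idx[~mask], corrected[~mask])
    return corrected


def flag_by_ratio(intensity, ratio_limit):
    """Approach ii (sketch): flag peaks whose prominence/width ratio is high.

    ratio_limit stands in for the paper's limit of detection (assumption).
    """
    peaks, props = find_peaks(intensity, prominence=0, width=0)
    ratio = props["prominences"] / props["widths"]
    return peaks[ratio > ratio_limit]


# Synthetic example: one broad "Raman" band plus a single-sample spike.
x = np.arange(300)
spectrum = 100.0 * np.exp(-((x - 100) ** 2) / (2 * 10.0 ** 2))
spectrum[200] += 50.0

peaks, props = detect_spikes(spectrum, min_prominence=10.0, max_width=3.0)
cleaned = remove_spikes(spectrum, props)
ratio_flagged = flag_by_ratio(spectrum, ratio_limit=10.0)
```

In this toy spectrum the broad band is rejected because it is wide and the spike is kept because it is narrow. Real spectra contain noise, so both thresholds would need tuning against the data, which is the part the paper's validation addresses.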
chemistry, analytical