Automated classification pipeline for real-time in vivo examination of colorectal tissue using Raman spectroscopy

Jan Vališ, Markéta Fousková, Daniela Janstová, Lucie Habartová, Jaromír Petrtýl, Luboš Petruželka, Alla Synytsya, Vladimír Setnička
DOI: https://doi.org/10.1016/j.saa.2024.124152
IF: 4.831
2024-03-14
Spectrochimica Acta Part A Molecular and Biomolecular Spectroscopy
Abstract: Colorectal cancer is the third most common malignancy worldwide and one of the leading causes of death in oncological patients; its diagnosis typically requires confirmation by tissue biopsy. In vivo Raman spectroscopy, an experimental diagnostic method less invasive than a biopsy, has shown great potential to discriminate between normal and cancerous tissue. However, the complex and often manual processing of Raman spectra, along with the absence of a suitable instant classifier, are the main obstacles to its adoption in clinical practice. This study aims to address these issues by developing a real-time automated classification pipeline coupled with a user-friendly application tailored for non-spectroscopists. First, in addition to routine colonoscopy, 377 subjects underwent in vivo acquisition of Raman spectra of healthy tissue, adenomatous polyps, or cancerous tissue using a custom-made microprobe. The spectra were then loaded into the pipeline and pre-processed in several steps, including standard normal variate transformation and finite impulse response filtration. The quality of the pre-processed spectra was checked based on their signal-to-noise ratio, and the suitable spectra were then decomposed by principal component analysis and classified by a support vector machine. After five-fold cross-validation, the developed classifier exhibited 100% sensitivity toward adenocarcinoma and adenomatous polyps. The overall accuracy was 96.9% for adenocarcinoma and 79.2% for adenomatous polyps. In addition, an application with a graphical user interface was developed to facilitate the use of the data pipeline by medical professionals in a clinical environment.
Overall, the combination of supervised and unsupervised machine learning with algorithmic pre-processing of in vivo Raman spectra appears to be a viable way of reducing the relatively large number of biopsies currently needed to definitively diagnose colorectal cancer.
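The pre-processing and quality-check steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the order of the steps, the 5-tap moving-average FIR kernel, the SNR definition, the threshold value, and the synthetic test spectrum are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random
import statistics


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean
    and scale it by its own (sample) standard deviation."""
    mean = statistics.fmean(spectrum)
    std = statistics.stdev(spectrum)
    return [(x - mean) / std for x in spectrum]


def fir_filter(spectrum, taps):
    """Apply a symmetric FIR kernel by direct convolution.
    Indices beyond the ends are clamped to the first/last sample."""
    half = len(taps) // 2
    last = len(spectrum) - 1
    out = []
    for i in range(len(spectrum)):
        acc = 0.0
        for k, weight in enumerate(taps):
            j = min(max(i + k - half, 0), last)
            acc += weight * spectrum[j]
        out.append(acc)
    return out


def snr(unfiltered, filtered):
    """Crude SNR estimate (an assumption, not the paper's definition):
    range of the filtered signal over the std of the removed residual."""
    residual = [u - f for u, f in zip(unfiltered, filtered)]
    return (max(filtered) - min(filtered)) / statistics.stdev(residual)


def synthetic_spectrum(n=200, noise=0.05, seed=0):
    """Hypothetical test input: one Gaussian band plus Gaussian noise."""
    rng = random.Random(seed)
    return [math.exp(-((i - 100) / 8.0) ** 2) + rng.gauss(0.0, noise)
            for i in range(n)]


TAPS = [1 / 5] * 5       # hypothetical 5-tap moving-average kernel
SNR_THRESHOLD = 10.0     # hypothetical acceptance threshold

raw = synthetic_spectrum()
scaled = snv(raw)                      # step 1: SNV (order is assumed)
smoothed = fir_filter(scaled, TAPS)    # step 2: FIR filtration
quality = snr(scaled, smoothed)        # step 3: quality check
accepted = quality >= SNR_THRESHOLD    # only accepted spectra go on
```

In a full pipeline, only spectra with `accepted` set would be passed on to the decomposition and classification stage (principal component analysis followed by a support vector machine in the study); that stage is left out here because it would normally rely on a third-party library.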
spectroscopy