Hyperspectral unmixing for Raman spectroscopy via physics-constrained autoencoders

Dimitar Georgiev, Álvaro Fernández-Galiana, Simon Vilms Pedersen, Georgios Papadopoulos, Ruoxiao Xie, Molly M. Stevens, Mauricio Barahona
2024-03-07
Abstract: Raman spectroscopy is widely used across scientific domains to characterize the chemical composition of samples in a non-destructive, label-free manner. Many applications entail the unmixing of signals from mixtures of molecular species to identify the individual components present and their proportions, yet conventional methods for chemometrics often struggle with complex mixture scenarios encountered in practice. Here, we develop hyperspectral unmixing algorithms based on autoencoder neural networks, and we systematically validate them using both synthetic and experimental benchmark datasets created in-house. Our results demonstrate that unmixing autoencoders provide improved accuracy, robustness and efficiency compared to standard unmixing methods. We also showcase the applicability of autoencoders to complex biological settings by showing improved biochemical characterization of volumetric Raman imaging data from a monocytic cell.
Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses hyperspectral unmixing in Raman spectroscopy. Specifically, the authors develop hyperspectral unmixing algorithms based on autoencoder neural networks and validate them on both synthetic and experimental datasets. Conventional chemometric methods often perform poorly on complex mixtures, and the proposed approach aims to improve the accuracy, robustness and efficiency of unmixing, particularly for applications in complex biological settings.

### Background and Problem Description

Raman spectroscopy is widely used across scientific domains to characterize the chemical composition of samples in a non-destructive, label-free manner. Many applications require unmixing the combined signals of several molecular species in order to identify the individual components present and their proportions. However, conventional methods often perform poorly on the complex mixture scenarios encountered in practice.

### Solution

The authors develop hyperspectral unmixing algorithms based on autoencoder (AE) neural networks and systematically validate them using synthetic and experimental datasets. The main contributions include:

1. **Design of the Autoencoder**:
   - **Encoder**: Transforms the input spectrum into a low-dimensional latent space representation.
   - **Decoder**: Reconstructs the original input spectrum from the latent space representation. The decoder can be designed as a linear or nonlinear mixing model to suit different application scenarios.
2. **Introduction of Physical Constraints**:
   - Non-negativity and sum-to-one constraints are imposed on the abundances so that the unmixing results remain physically meaningful.
3. **Performance Verification**:
   - Synthetic and experimental datasets are used to evaluate the autoencoders on the unmixing task. The results show that they outperform conventional unmixing methods in accuracy, robustness and efficiency.
4. **Biological Applications**:
   - The autoencoders are applied to a complex biological setting, namely the biochemical characterization of volumetric Raman imaging data from a monocytic cell.

### Main Results

- **Benchmark Tests on Synthetic Datasets**:
  - The autoencoders perform well on both linear and nonlinear mixtures, especially in the presence of higher noise levels and data artifacts.
  - In terms of computational efficiency, the autoencoders are faster than the conventional N-FINDR+FCLS and VCA+FCLS methods.
- **Verification on Experimental Datasets**:
  - On experimental data from sugar solution mixtures, the autoencoders achieve better unmixing performance across different signal-to-noise ratio conditions.
  - On volumetric Raman imaging data from a monocytic cell, the autoencoders identify the biochemical components of the cell more accurately, including DNA, proteins, triglycerides, phospholipids and cholesterol esters.

### Conclusion

The autoencoder-based hyperspectral unmixing method proposed in this paper shows clear advantages on complex mixture scenarios, with high accuracy and robustness. The method is not only applicable to Raman spectroscopy but may also extend to other spectroscopic modalities, such as infrared spectroscopy. Future research directions include exploring more complex autoencoder architectures and training objectives, and using autoencoders as a pre-training step for downstream tasks.
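
To make the encoder, decoder and physical constraints described above more concrete, here is a minimal sketch of a single forward pass through a linear-mixing unmixing autoencoder, written in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the single-layer encoder, the softmax used to enforce non-negativity and sum-to-one, the random endmember spectra, and all names (`encode`, `decode`, `endmembers`, `weights`) are hypothetical choices made for this example. No training step is shown.

```python
import math
import random


def softmax(logits):
    """Map arbitrary real values to non-negative values that sum to one."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def encode(spectrum, weights, bias):
    """Hypothetical single-layer encoder: spectrum -> abundance vector.

    The softmax is one common way to satisfy both abundance constraints
    at once (an assumption here, not a detail confirmed by the summary).
    """
    logits = [
        sum(w * x for w, x in zip(row, spectrum)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


def decode(abundances, endmembers):
    """Linear mixing model: reconstruction = sum_k a_k * endmember_k."""
    n_bands = len(endmembers[0])
    return [
        sum(a * e[j] for a, e in zip(abundances, endmembers))
        for j in range(n_bands)
    ]


random.seed(0)
n_bands, n_endmembers = 8, 3  # toy sizes, chosen only for illustration

# Made-up non-negative endmember spectra (one row per pure component).
endmembers = [
    [abs(random.gauss(0.0, 1.0)) for _ in range(n_bands)]
    for _ in range(n_endmembers)
]

# Synthetic mixed spectrum built from known abundances.
true_abundances = [0.6, 0.3, 0.1]
spectrum = decode(true_abundances, endmembers)

# Untrained encoder parameters; in practice these would be learned by
# minimising a reconstruction loss over many spectra.
weights = [
    [random.gauss(0.0, 0.1) for _ in range(n_bands)]
    for _ in range(n_endmembers)
]
bias = [0.0] * n_endmembers

abundances = encode(spectrum, weights, bias)
reconstruction = decode(abundances, endmembers)

# Mean squared reconstruction error, the kind of quantity training would reduce.
mse = sum((r - s) ** 2 for r, s in zip(reconstruction, spectrum)) / n_bands
```

Because the encoder here is untrained, the estimated abundances will not match `true_abundances`; the point is only that the constraints hold by construction, whatever the encoder weights are. A nonlinear decoder, as mentioned in the summary, would replace `decode` with a more expressive mixing function.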