Preserving Full Spectrum Information in Imaging Mass Spectrometry Data Reduction

Roger A.R. Moens, Lukasz G. Migas, Jacqueline M. Van Ardenne, Eric P. Skaar, Jeffrey M. Spraggins, Raf Van de Plas
DOI: https://doi.org/10.1101/2024.09.30.614425
2024-10-01
Abstract: Imaging Mass Spectrometry (IMS) has become an important tool for molecular characterization of biological tissue. However, IMS experiments tend to yield large datasets, routinely recording over 200,000 ion intensity values per mass spectrum and more than 100,000 pixels, i.e., spectra, per dataset. Traditionally, IMS data size challenges have been addressed by feature selection or extraction, such as by peak picking and peak integration. Selective data reduction techniques such as peak picking only retain certain parts of a mass spectrum, and often these describe only medium-to-high-abundance species. Since lower-intensity peaks and, for example, near-isobar species are sometimes missed, selective methods can potentially bias downstream analysis towards a subset of species in the data rather than considering all species measured. Results: We present an alternative to selective data reduction of IMS data that achieves similar data size reduction while better conserving the ion intensity profiles across all recorded m/z-bins, thereby preserving full spectrum information. Our method utilizes a low-rank matrix completion model combined with a randomized sparse-format-aware algorithm to approximate IMS datasets. This representation offers reduced dimensionality and a data footprint comparable to peak picking, but also retains complete spectral profiles, enabling comprehensive analysis and compression. We demonstrate improved preservation of lower signal-to-noise-ratio signals and near-isobars, mitigation of selection bias, and reduced information loss compared to current state-of-the-art data reduction methods in IMS.
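The general idea of approximating a sparse pixels-by-m/z-bins matrix with a low-rank factorization can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract describes a matrix completion model, whereas the code below uses a plain randomized SVD (a randomized range finder with a few power iterations) that treats unrecorded entries as zeros. The function name `randomized_low_rank`, its parameters, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np
import scipy.sparse as sp


def randomized_low_rank(X, rank, oversample=10, n_iter=2, seed=0):
    """Hypothetical helper: rank-`rank` factorization X ~ (U * s) @ Vt.

    X is a SciPy sparse matrix (rows = pixels, columns = m/z bins).
    X is only touched through sparse-times-dense products, so it is
    never converted to a dense array inside this function.
    """
    rng = np.random.default_rng(seed)
    n_pixels, n_bins = X.shape
    k = min(rank + oversample, n_pixels, n_bins)

    # Sample the column space of X with a Gaussian test matrix.
    Q, _ = np.linalg.qr(X @ rng.standard_normal((n_bins, k)))

    # Power iterations sharpen the basis when singular values decay slowly.
    for _ in range(n_iter):
        Z, _ = np.linalg.qr(X.T @ Q)
        Q, _ = np.linalg.qr(X @ Z)

    # Project X onto the basis and take a small dense SVD.
    B = (X.T @ Q).T  # shape (k, n_bins)
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]


# Synthetic stand-in for an IMS dataset (assumed sizes, not real data):
# 200 "pixels" by 300 "m/z bins", built to have rank at most 5.
A = sp.random(200, 5, density=0.3, random_state=0, format="csr")
C = sp.random(5, 300, density=0.3, random_state=1, format="csr")
X = (A @ C).tocsr()

U, s, Vt = randomized_low_rank(X, rank=5)

# Relative reconstruction error (dense conversion is for checking only).
X_dense = X.toarray()
rel_err = np.linalg.norm(X_dense - (U * s) @ Vt) / np.linalg.norm(X_dense)

# Numbers stored by the factorization versus nonzeros in the sparse matrix.
factor_size = U.size + s.size + Vt.size
print(f"relative error: {rel_err:.2e}, factors: {factor_size}, nnz: {X.nnz}")
```

In this sketch every m/z bin keeps a reconstructed intensity profile, which is the property the abstract contrasts with peak picking. Whether the stored factors are smaller than the sparse input depends on the chosen rank and on how sparse the data are.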
Bioinformatics