Compound identification using random projection location-sensitive Hash for gas chromatography-mass spectrometry

Li-Li Cao, Zhi-Shui Zhang, Jun Zhang
DOI: https://doi.org/10.1109/ICSAI.2014.7009416
2014-01-01
Abstract: Current compound identification methods generally rely on mass spectral similarity matching, in which cosine correlation and its composite measures serve as the similarity measures. Several composite similarity measures have recently shown much better performance, in particular the Weighted Cosine (WC) measure. In this work, we introduce random projection location-sensitive hash as a similarity algorithm for mass spectra. We use it to identify compounds by applying multiple projections and averaging the Hamming distances between the binary codes of the replicate data and the binary codes of the reference data. To evaluate the method, the National Institute of Standards and Technology (NIST) mass spectral library was used as the reference database and a replicate database was used as the query data. The experimental results showed that applying peak intensity weighting to the query and reference spectra always outperformed using the unweighted query and reference spectra. The performance of the random projection location-sensitive hash with repeated projections is nearly identical to that of the WC measure, which achieves the highest accuracy of 84% in similarity search with the optimal weight factors of (0.53, 1.3).
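The matching procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes sign-of-random-projection hashing (one bit per random Gaussian hyperplane), spectra stored as intensity lists indexed by integer m/z, and that the weight factors (0.53, 1.3) are the exponents applied to intensity and m/z respectively. The abstract does not state that ordering. All function names, the code length `n_bits`, and the number of repetitions `n_projections` are hypothetical.

```python
import random

def weight_spectrum(intensities, a=0.53, b=1.3):
    # Assumption: intensities[i] is the peak intensity at m/z = i + 1,
    # and the weighted peak is intensity**a * mz**b.
    return [(inten ** a) * (mz ** b)
            for mz, inten in enumerate(intensities, start=1)]

def make_hyperplanes(dim, n_bits, rng):
    # One random Gaussian direction per output bit.
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(vector, hyperplanes):
    # Each bit records which side of a hyperplane the vector falls on.
    return [1 if sum(h * v for h, v in zip(plane, vector)) >= 0 else 0
            for plane in hyperplanes]

def hamming(code_a, code_b):
    return sum(x != y for x, y in zip(code_a, code_b))

def average_hamming(query, reference, n_bits=64, n_projections=10, seed=0):
    # Repeat the projection several times and average the Hamming distances.
    # Re-seeding means every query/reference pair sees the same hyperplanes.
    rng = random.Random(seed)
    q, r = weight_spectrum(query), weight_spectrum(reference)
    total = 0
    for _ in range(n_projections):
        planes = make_hyperplanes(len(q), n_bits, rng)
        total += hamming(hash_code(q, planes), hash_code(r, planes))
    return total / n_projections

def identify(query, library):
    # Return the name of the reference spectrum with the smallest
    # average Hamming distance to the query.
    return min(library, key=lambda name: average_hamming(query, library[name]))

# Toy reference library (made-up spectra, not NIST data).
library = {
    "A": [100.0, 0.0, 0.0, 20.0, 0.0, 0.0],
    "B": [0.0, 80.0, 10.0, 0.0, 0.0, 5.0],
    "C": [0.0, 0.0, 0.0, 0.0, 90.0, 30.0],
}
best = identify([0.0, 80.0, 10.0, 0.0, 0.0, 5.0], library)
```

For sign-based hashing of this kind, the expected Hamming distance grows with the angle between the two weighted spectra, so a smaller average distance corresponds to a higher cosine similarity. That is consistent with the abstract's observation that the method performs almost the same as the WC measure.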