Decoding the Impact of Neighboring Amino Acid on ESI-MS Intensity Output through Deep Learning

Naim Abdul-Khalek, Reinhard Wimmer, Michael Toft Overgaard, Simon Gregersen Echers
DOI: https://doi.org/10.1101/2024.02.02.578588
2024-02-06
Abstract: Peptide-level quantification using mass spectrometry (MS) is no trivial task, as the physicochemical properties of a peptide affect both its response and its detectability. The specific amino acid (AA) sequence determines these properties; however, the link from sequence to intensity output remains poorly understood. In this work, we explore combinations of amino acid pairs (i.e., dimer motifs) to determine a potential relationship between the local amino acid environment and MS1 intensity. For this purpose, a deep learning (DL) model, consisting of an encoder-decoder with an attention mechanism, was built. The attention mechanism made it possible to identify the most relevant motifs. Specific patterns were consistently observed: a bulky/aromatic and hydrophobic AA followed by a cationic AA, as well as consecutive bulky/aromatic and hydrophobic AAs, were found to be important for the MS1 intensity. Correlating attention weights with mean MS1 intensities revealed that some important motifs, particularly those containing Trp, His, and Cys, were linked with low-responding peptides, whereas motifs containing Lys and most bulky hydrophobic AAs were often associated with high-responding peptides. Moreover, Asn–Gly was associated with low MS1 response. The model could predict MS1 response with a mean average percentage error of ∼11% and a Pearson correlation coefficient of ∼0.68.
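As a rough illustration of two ideas in the abstract, the sketch below lists the dimer motifs of a peptide and computes the two reported evaluation metrics. It is not the authors' code. The function names are hypothetical, the error metric is assumed to be the usual mean absolute percentage error, and the abstract does not say how intensities were scaled before evaluation.

```python
import math


def dimer_motifs(sequence):
    """Return all overlapping adjacent amino acid pairs in a peptide.

    Assumption: a "dimer motif" is any two consecutive residues, read
    left to right in one-letter code.
    """
    return [sequence[i:i + 2] for i in range(len(sequence) - 1)]


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumption: this is the metric meant by "mean average percentage
    error" in the abstract. Requires non-zero true values.
    """
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


if __name__ == "__main__":
    # Made-up peptide and intensity values, for illustration only.
    print(dimer_motifs("WKNG"))
    print(mape([100.0, 200.0], [110.0, 180.0]))
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

In a setup like the one described, the pairs returned by `dimer_motifs` would be the units whose attention weights are compared against mean MS1 intensity, while `mape` and `pearson` would be applied to predicted versus measured intensities on held-out peptides.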
Bioinformatics