BiATNovo: An Attention-based Bidirectional De Novo Sequencing Framework for Data-Independent-Acquisition Mass Spectrometry

Shu Yang, Siyu Wu, Binyang Li, Yuxiaomei Liu, Fangzheng Li, Jiaxing Qi, Qunying Wang, Xiaohui Liang, Tiannan Guo, Zhongzhi Luan
DOI: https://doi.org/10.1101/2023.05.11.540352
2024-10-28
Abstract: De novo sequencing from tandem mass spectrometry (MS/MS) data is a key technique for identifying novel peptides. In theory, the Data-Independent Acquisition (DIA) method can fragment all precursor ions in an unbiased and non-targeted fashion. However, each DIA spectrum contains fragments from multiple precursor ions, and the unclear relationship between these precursors and their fragments poses a significant challenge to the accuracy of de novo sequencing algorithms. Here we present BiATNovo, an attention-based bidirectional de novo peptide sequencing framework. BiATNovo comprises a bidirectional attention-based model and a bidirectional fusion-reranking post-processing module. Together, these components efficiently capture the relationships between tandem mass spectra, fragment ions, and peptide patterns, and they expand the candidate set from which the optimal sequence is selected. The framework improves peptide prediction accuracy, particularly for long peptide sequences, and mitigates the imbalance in which the initial amino acids are predicted more accurately than the final ones. Evaluation results demonstrate that BiATNovo outperforms existing algorithms, including DeepNovo-DIA and PepNet, at both the peptide level and the amino acid level. Furthermore, when extended to DDA datasets, BiATNovo achieves performance comparable to state-of-the-art models.
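To make the idea of fusing and reranking bidirectional candidates more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a forward (N-to-C) decoder and a backward (C-to-N) decoder each return candidate sequences with a log-probability, and it reranks the merged pool by a hypothetical criterion: candidates whose mass matches the precursor mass within a tolerance are preferred, with the log-probability as tie-breaker. The names `Candidate`, `fuse_and_rerank`, the tolerance `tol`, and the scoring rule are all assumptions introduced for illustration; the abstract does not specify how BiATNovo scores candidates.

```python
from dataclasses import dataclass

# Standard monoisotopic residue masses (Da) for a small subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "E": 129.04259,
}
WATER = 18.01056  # mass of H2O added to the residue sum of a neutral peptide


@dataclass(frozen=True)
class Candidate:
    """Hypothetical decoder output: a sequence in decoding order plus a score."""
    sequence: str
    log_prob: float


def peptide_mass(seq: str) -> float:
    """Neutral monoisotopic mass of a peptide written in N-to-C order."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def fuse_and_rerank(forward, backward, precursor_mass, tol=0.02):
    """Merge forward and backward candidates and return the best sequence.

    `forward` holds sequences in N-to-C order; `backward` holds sequences in
    C-to-N order, so they are reversed before merging. The ranking rule
    (mass match first, then log-probability) is an illustrative assumption.
    """
    pool = {}
    for cand in forward:
        best = pool.get(cand.sequence, float("-inf"))
        pool[cand.sequence] = max(best, cand.log_prob)
    for cand in backward:
        seq = cand.sequence[::-1]  # bring into N-to-C order
        pool[seq] = max(pool.get(seq, float("-inf")), cand.log_prob)

    def rank_key(item):
        seq, log_prob = item
        mass_ok = abs(peptide_mass(seq) - precursor_mass) <= tol
        return (mass_ok, log_prob)

    return max(pool.items(), key=rank_key)[0]


forward_candidates = [Candidate("PEAK", -1.0), Candidate("PEAL", -0.5)]
backward_candidates = [Candidate("KAEP", -0.8)]  # C-to-N order
best_sequence = fuse_and_rerank(
    forward_candidates, backward_candidates, peptide_mass("PEAK")
)
```

In this toy input, the higher-probability forward candidate "PEAL" does not match the precursor mass, so the mass-consistent sequence "PEAK", which both directions proposed, is selected. A real reranker would likely use richer evidence, such as fragment-ion agreement, than this two-term key.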
Bioinformatics