Abstract: The two key steps in analyzing proteomic data generated by high-resolution MS are database searching and postprocessing. Although the two steps are interrelated, their combined effects and the optimization of these procedures have not been adequately studied. Here, we investigated the performance of three popular search engines (SEQUEST, Mascot, and MS Amanda) in conjunction with five filtering approaches: the respective score-based filtering, a group-based approach, local false discovery rate (LFDR), PeptideProphet, and Percolator. A total of eight data sets from various proteomes (e.g., E. coli, yeast, and human), produced by various instruments with high-accuracy survey scans (MS1) and high- or low-accuracy fragment ion scans (MS2) (LTQ-Orbitrap, Orbitrap-Velos, Orbitrap-Elite, Q-Exactive, Orbitrap-Fusion, and Q-TOF), were analyzed. Combinations involving Percolator achieved markedly more peptide and protein identifications at the same FDR level than the other 12 combinations for all data sets. Among these, SEQUEST-Percolator and MS Amanda-Percolator performed slightly better than the other methods for data sets with low-accuracy MS2 (ion trap, IT) and high-accuracy MS2 (Orbitrap or TOF), respectively. Among approaches without Percolator, SEQUEST-group performed best for data sets with MS2 produced by collision-induced dissociation (CID) and IT analysis; Mascot-LFDR gave more identifications for data sets generated by higher-energy collisional dissociation (HCD) and analyzed in the Orbitrap (HCD-OT) and in the Orbitrap Fusion (HCD-IT); and MS Amanda-group excelled for the Q-TOF data set and the Orbitrap Velos HCD-OT data set. Therefore, if Percolator is not used, a specific combination should be applied for each type of data set.
Moreover, a higher percentage of multiple-peptide proteins and lower variation in protein spectral counts were observed when technical replicates were analyzed using Percolator-associated combinations; Percolator therefore enhanced the reliability of both identification and quantification. The analyses were performed using the specific programs embedded in Proteome Discoverer, Scaffold, and an in-house algorithm (BuildSummary). These results provide valuable guidelines for the optimal interpretation of proteomic results and the development of fit-for-purpose protocols under different situations.
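
The comparison "at the same FDR level" can be illustrated with a minimal sketch. The abstract does not state how the FDR was estimated, so the target-decoy estimate (decoys divided by targets above a score threshold) used here is an assumption, and the function name `filter_at_fdr` and its input layout are hypothetical. This is not the algorithm of Percolator, PeptideProphet, or any of the tools named above.

```python
from typing import List, Tuple


def filter_at_fdr(psms: List[Tuple[float, bool]], max_fdr: float = 0.01) -> List[float]:
    """Return scores of target PSMs accepted at the requested FDR.

    psms: (score, is_decoy) pairs; a higher score is assumed to be better.
    The FDR at each rank is estimated as decoys / targets (an assumption).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    keep = 0  # number of top-ranked PSMs that can be kept
    for rank, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimated FDR is still acceptable.
        if targets and decoys / targets <= max_fdr:
            keep = rank
    # Report only target hits; decoys exist solely to estimate the error rate.
    return [score for score, is_decoy in ranked[:keep] if not is_decoy]


if __name__ == "__main__":
    toy = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, False)]
    print(len(filter_at_fdr(toy, max_fdr=0.0)), "targets accepted at 0% FDR")
    print(len(filter_at_fdr(toy, max_fdr=0.5)), "targets accepted at 50% FDR")
```

Under this sketch, two search-and-filter combinations would be compared by counting the accepted targets each one yields at an identical `max_fdr`.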

Scoring Information Integration with Statistical Quality Control Enhanced Cross-Run Analysis of Data-Independent Acquisition Proteomics Data

ProteinInferencer: Confident protein identification and multiple experiment comparison for large scale proteomics projects

Data-Driven Tool for Cross-Run Ion Selection and Peak-Picking in Quantitative Proteomics with Data-Independent Acquisition LC-MS/MS

Deep Representation Features from DreamDIAXMBD Improve the Analysis of Data-Independent Acquisition Proteomics

A New Evaluation Metric for Quantitative Accuracy of LC-MS/MS-Based Proteomics with Data-Independent Acquisition

Cross-Run Hybrid Features Improve the Identification of Data-Independent Acquisition Proteomics

Software for Quantitative Proteomic Analysis Using Stable Isotope Labeling and Data Independent Acquisition

Assessing the Relationship Between Mass Window Width and Retention Time Scheduling on Protein Coverage for Data-Independent Acquisition

Micro-Data-Independent Acquisition for High-Throughput Proteomics and Sensitive Peptide Mass Spectrum Identification

Interrogating data-independent acquisition LC-MS/MS for affinity proteomics

PeaksDB: New Software for Substantially Improved Peptide Identification from Orbitrap ETD Mass Spectrometry

Accurate Precursor Mass Assignment Improves Peptide Identification in Data-Independent Acquisition Mass Spectrometry

Optimization of Search Engines and Postprocessing Approaches to Maximize Peptide and Protein Identification for High-Resolution Mass Data

Enhancing peptide identification confidence by combining search methods

Data‐Independent Acquisition Mass Spectrometry‐Based Proteomics and Software Tools: A Glimpse in 2020

Improving Peptide Identification with Single-Stage Mass Spectrum Peaks

Increasing the Throughput and Reproducibility of Activity-Based Proteome Profiling Studies with Hyperplexing and Intelligent Data Acquisition

MSSort-DIAXMBD: A Deep Learning Classification Tool of the Peptide Precursors Quantified by OpenSWATH

MdFDIA: A Mass Defect Based Four-Plex Data-Independent Acquisition Strategy for Proteome Quantification

Mass defect-based carbonyl activated tags (mdCATs) for multiplex data-independent acquisition proteome quantification

Quantitative Chemoproteomic Profiling with Data-Independent Acquisition-Based Mass Spectrometry