Abstract

Motivation: Cross-linking tandem mass spectrometry (XL-MS/MS) is an established analytical platform used to determine distance constraints between residues within a protein or between physically interacting proteins, thus improving our understanding of protein structure and function. To aid biological discovery with XL-MS/MS, it is essential that pairs of chemically linked peptides be accurately identified, a process that requires: (i) database search, which creates a ranked list of candidate peptide pairs for each experimental spectrum, and (ii) false discovery rate (FDR) estimation, which determines the probability of a false match in a group of top-ranked peptide pairs with scores above a given threshold. Currently, the only available FDR estimation mechanism in XL-MS/MS is the target-decoy approach (TDA). However, despite its simplicity, TDA has both theoretical and practical limitations that reduce estimation accuracy and increase run time relative to potential decoy-free approaches (DFAs).

Results: We introduce a novel decoy-free framework for FDR estimation in XL-MS/MS. Our approach relies on multi-sample mixtures of skew normal distributions, where the latent components correspond to the scores of correct peptide pairs (both peptides identified correctly), partially incorrect peptide pairs (one peptide identified correctly, the other incorrectly), and incorrect peptide pairs (both peptides identified incorrectly). To learn these components, we exploit the score distributions of first- and second-ranked peptide-spectrum matches for each experimental spectrum and subsequently estimate FDR using a novel expectation-maximization algorithm with constraints. We evaluate the method on ten datasets and provide evidence that the proposed DFA is theoretically sound and a viable alternative to TDA owing to its good performance in terms of accuracy, variance of estimation, and run time.

Availability and implementation: https://github.com/shawn-peng/xlms
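
To make the idea of a mixture-based FDR concrete, the following minimal Python sketch shows how an FDR could be read off a three-component skew normal mixture once its parameters are known. It is an illustration only and not the authors' algorithm: the constrained expectation-maximization fit on first- and second-ranked matches is not reproduced here, the function names and all parameter values are hypothetical, and the sketch assumes that both partially incorrect and incorrect peptide pairs count as false matches.

```python
import math

def skew_normal_pdf(x, loc, scale, shape):
    """Skew normal density: (2 / scale) * phi(z) * Phi(shape * z)."""
    z = (x - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 / scale * phi * big_phi

def tail_mass(threshold, loc, scale, shape, steps=4000):
    """Approximate P(score >= threshold) with the trapezoidal rule."""
    upper = loc + 12.0 * scale  # density is negligible beyond this point
    if threshold >= upper:
        return 0.0
    h = (upper - threshold) / steps
    total = 0.5 * (skew_normal_pdf(threshold, loc, scale, shape)
                   + skew_normal_pdf(upper, loc, scale, shape))
    for i in range(1, steps):
        total += skew_normal_pdf(threshold + i * h, loc, scale, shape)
    return total * h

def mixture_fdr(threshold, components):
    """Fraction of mixture mass above the threshold that is not fully correct.

    Each component is (weight, loc, scale, shape, is_correct).
    """
    false_mass = 0.0
    total_mass = 0.0
    for weight, loc, scale, shape, is_correct in components:
        mass = weight * tail_mass(threshold, loc, scale, shape)
        total_mass += mass
        if not is_correct:
            false_mass += mass
    return false_mass / total_mass if total_mass > 0.0 else 0.0

# Hypothetical, hand-picked parameters (not fitted to any dataset).
components = [
    (0.4, 30.0, 8.0, 2.0, True),   # correct peptide pairs
    (0.3, 15.0, 6.0, 1.0, False),  # partially incorrect peptide pairs
    (0.3, 5.0, 5.0, 1.0, False),   # incorrect peptide pairs
]

for t in (10.0, 20.0, 30.0, 40.0):
    print(f"threshold={t:5.1f}  estimated FDR={mixture_fdr(t, components):.4f}")
```

With well-separated components, the estimated FDR falls as the score threshold rises, because the correct component dominates the upper tail of the mixture.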

Improved detection of differentially abundant proteins through FDR-control of peptide-identity-propagation

A statistical method for the conservative adjustment of false discovery rate (q-value)

ProteinInferencer: Confident protein identification and multiple experiment comparison for large scale proteomics projects

A Three-Stage Search Strategy Combining Database Reduction and Retention Time Filtering to Improve the Sensitivity of Low-Input and Single-Cell Proteomic Analysis.

Assessment of false discovery rate control in tandem mass spectrometry analysis using entrapment

Improved False Discovery Rate Estimation Procedure for Shotgun Proteomics

APIR: Aggregating Universal Proteomics Database Search Algorithms for Peptide Identification with FDR Control

Reinvestigating the Correctness of Decoy-Based False Discovery Rate Control in Proteomics Tandem Mass Spectrometry

[Clinical experiences and experimental results with high-dose intravenous administration of hylase "Dessau"].

Multiple competition-based FDR control for peptide detection

Improvement Of The Quantification Accuracy And Throughput For Phosphoproteome Analysis By A Pseudo Triplex Stable Isotope Dimethyl Labeling Approach

Degps is a Powerful Tool for Detecting Differential Expression in RNA-sequencing Studies

An algorithm for decoy-free false discovery rate estimation in XL-MS/MS proteomics

Interrogating data-independent acquisition LC-MS/MS for affinity proteomics

PepDistiller: A Quality Control Tool to Improve the Sensitivity and Accuracy of Peptide Identifications in Shotgun Proteomics.

How to train a post-processor for tandem mass spectrometry proteomics database search while maintaining control of the false discovery rate

A New Strategy to Filter out False Positive Identifications of Peptides in SEQUEST Database Search Results

Global FDR control across multiple RNAseq experiments

Proteomic stable isotope probing with an upgraded Sipros algorithm for improved identification and quantification of isotopically labeled proteins

Data processing of product ion spectra: Methods to control false discovery rate in compound search results for non-targeted metabolomics

Intrinsic molecular identifiers enable robust molecular counting in single-cell sequencing