Scoring Information Integration with Statistical Quality Control Enhanced Cross-Run Analysis of Data-Independent Acquisition Proteomics Data

Mingxuan Gao, Shubham Gupta, Wenxian Yang, Rongshan Yu, Hannes Rost
DOI: https://doi.org/10.1101/2024.12.19.629475
2024-12-22
Abstract: The peptide-centric strategy is widely applied in data-independent acquisition (DIA) proteomics to analyze multiplexed MS2 spectra. However, current software tools often rely on single-run data for peptide peak identification, leading to inconsistent quantification across heterogeneous datasets. Match-between-runs (MBR) algorithms address this by aligning peaks or elution profiles across runs after the initial analysis, but they are often ad hoc and lack statistical frameworks for controlling peak quality, resulting in false positives and reduced quantitative reproducibility. Here we present DreamDIAlignR, a cross-run peptide-centric tool that integrates peptide elution behavior across runs with a deep learning peak identifier and a signal alignment algorithm for consistent peak picking and FDR-controlled scoring. DreamDIAlignR outperformed state-of-the-art MBR methods, identifying up to 25.6% more quantitatively changing proteins on a benchmark dataset and 38.5% more on a cancer dataset. Additionally, DreamDIAlignR establishes an improved methodology for performing MBR that is compatible with existing DIA analysis tools, thereby enhancing the overall quality of DIA analysis.
Bioinformatics