High-Coverage Four-Dimensional Data-Independent Acquisition Proteomics and Phosphoproteomics Enabled by Deep Learning-Driven Multi-Dimensional Prediction

Moran Chen, Pujia Zhu, Pengfei Wu, Yanhong Hao, Zhourui Zhang, Jian Sun, Wenjing Nie, Suming Chen
DOI: https://doi.org/10.1101/2022.06.12.495786
2022-06-16
bioRxiv
Abstract: Four-dimensional (4D) data-independent acquisition (DIA)-based proteomics is an emerging technology that has been proven to have high precursor ion sampling efficiency and higher precursor identification specificity. However, current 4D DIA proteomics still depends on building project-specific experimental libraries, which is time-consuming and limits the coverage of identification and quantification. Herein, we established a 4D DIA proteomics workflow that uses a predicted multi-dimensional in silico library. We developed a deep learning model, Deep4D, that predicts the collision cross section (CCS) and retention time (RT) of both unmodified and phosphorylated peptides with high accuracy. Using an integrated 4D in silico library containing millions of peptides, we identified 25% more proteins than with experimental libraries in the DIA proteomics analysis of HeLa cells. We further demonstrate that the in silico predicted library can greatly complement the experimental library of directly obtained phosphorylated peptides, resulting in a greater increase in the identification of phosphorylated peptides and phosphorylated proteins.