Phenotype Prediction using a Tensor Representation and Deep Learning from Data Independent Acquisition Mass Spectrometry

Fangfei Zhang, Shaoyang Yu, Lirong Wu, Zelin Zang, Xiao Yi, Jiang Zhu, Cong Lu, Ping Sun, Yaoting Sun, Sathiyamoorthy Selvarajan, Lirong Chen, Xiaodong Teng, Yongfu Zhao, Guangzhi Wang, Junhong Xiao, Shiang Huang, Oi Lian Kon, N. Gopalakrishna Iyer, Stan Z. Li, Zhongzhi Luan, Tiannan Guo
DOI: https://doi.org/10.1101/2020.03.05.978635
2020-01-01
Abstract: A novel approach for phenotype prediction from mass spectrometric data is developed. First, data-independent acquisition (DIA) mass spectrometric data are converted into a novel file format called “DIA tensor” (DIAT), which contains all peptide precursor and fragment information and can be used for convenient DIA visualization. The DIAT format is fed directly into a deep neural network to predict phenotypes without the need to identify peptides or proteins. We applied this strategy to a collection of 102 hepatocellular carcinoma samples and achieved an accuracy of 96.8% in classifying malignant from benign samples. We further applied a refined model to 492 samples of thyroid nodules to predict thyroid cancer and achieved a predictive accuracy of 91.7% in an independent cohort of 216 test samples. In conclusion, the DIA tensor enables facile 2D visualization of DIA proteomics data and offers a new approach for phenotype prediction directly from DIA-MS data.
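The tensor idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the actual DIAT layout, so everything here is an assumption made for illustration: the function name `spectra_to_tensor`, the axis order (isolation window, retention time, m/z), the bin counts, and the input tuple format are all hypothetical. The sketch only shows how fragment spectra could be binned into a fixed-size array that a neural network might take as input.

```python
import numpy as np

def spectra_to_tensor(spectra, n_windows, rt_range, mz_range,
                      n_rt_bins=4, n_mz_bins=5):
    """Bin fragment spectra into a (window, retention time, m/z) array.

    Hypothetical input: an iterable of (window_index, rt, mz_values,
    intensities) tuples. This is NOT the published DIAT format.
    """
    tensor = np.zeros((n_windows, n_rt_bins, n_mz_bins), dtype=np.float64)
    rt_min, rt_max = rt_range
    mz_min, mz_max = mz_range
    for window, rt, mz, intensity in spectra:
        mz = np.asarray(mz, dtype=np.float64)
        intensity = np.asarray(intensity, dtype=np.float64)
        # Map the scan's retention time to a row index, clamped to the grid.
        rt_bin = int((rt - rt_min) / (rt_max - rt_min) * n_rt_bins)
        rt_bin = min(max(rt_bin, 0), n_rt_bins - 1)
        # Map each fragment m/z to a column index, clamped to the grid.
        mz_bin = ((mz - mz_min) / (mz_max - mz_min) * n_mz_bins).astype(int)
        mz_bin = np.clip(mz_bin, 0, n_mz_bins - 1)
        # Accumulate intensities; np.add.at handles repeated indices correctly.
        np.add.at(tensor[window, rt_bin], mz_bin, intensity)
    return tensor
```

Each `tensor[w]` slice is then a 2D intensity map that could be rendered as an image or stacked as an input channel for a classifier. The grid resolution here is deliberately tiny; a realistic grid would be far larger.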