An Analysis of the Sensitivity of Proteogenomic Mapping of Somatic Mutations and Novel Splicing Events in Cancer
Kelly V Ruggles, Zuojian Tang, Xuya Wang, Himanshu Grover, Manor Askenazi, Jennifer Teubl, Song Cao, Michael D McLellan, Karl R Clauser, David L Tabb, Philipp Mertins, Robbert Slebos, Petra Erdmann-Gilmore, Shunqiang Li, Harsha P Gunawardena, Ling Xie, Tao Liu, Jian-Ying Zhou, Shisheng Sun, Katherine A Hoadley, Charles M Perou, Xian Chen, Sherri R Davies, Christopher A Maher, Christopher R Kinsinger, Karen D Rodland, Hui Zhang, Zhen Zhang, Li Ding, R Reid Townsend, Henry Rodriguez, Daniel Chan, Richard D Smith, Daniel C Liebler, Steven A Carr, Samuel Payne, Matthew J Ellis, David Fenyő
DOI: https://doi.org/10.1074/mcp.m115.056226
2016-01-01
Abstract: Improvements in mass spectrometry (MS)-based peptide sequencing provide a new opportunity to determine whether polymorphisms, mutations, and splice variants identified in cancer cells are translated. Herein, we apply a proteogenomic data integration tool (QUILTS) to illustrate protein variant discovery using whole genome, whole transcriptome, and global proteome datasets generated from a pair of luminal and basal-like breast-cancer-patient-derived xenografts (PDX). The sensitivity of proteogenomic analysis for single nucleotide variant (SNV) expression and novel splice junction (NSJ) detection was probed using multiple MS/MS sample process replicates, each defined here as an independent tandem MS experiment using identical sample material. Despite analysis of over 30 sample process replicates, only about 10% of SNVs (somatic and germline) detected by both DNA and RNA sequencing were observed as peptides. An even smaller proportion of peptides corresponding to NSJs observed by RNA sequencing was detected (<0.1%). Peptides mapping to DNA-detected SNVs without a detectable mRNA transcript were also observed, suggesting that transcriptome coverage was incomplete (∼80%). In contrast to germline variants, somatic variants were less likely to be detected at the peptide level in the basal-like tumor than in the luminal tumor, raising the possibility of differential translation or protein degradation effects. In conclusion, this large-scale proteogenomic integration allowed us to determine the degree to which mutations are translated and to identify gaps in sequence coverage, thereby benchmarking current technology and progress toward whole cancer proteome and transcriptome analysis.
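The sensitivity figures quoted in the abstract are, in essence, set overlaps between variant calls at the DNA, RNA, and peptide levels. The following is a minimal Python sketch of that bookkeeping, not the QUILTS implementation. The function names, the variant identifier strings, and the toy call sets are all hypothetical and chosen only for illustration; they are not data from the study.

```python
def peptide_level_sensitivity(dna_snvs, rna_snvs, peptide_snvs):
    """Fraction of SNVs seen in both DNA and RNA that are also seen as peptides.

    Each argument is an iterable of hashable variant identifiers
    (hypothetical format, e.g. "chr1:100:A>G").
    """
    supported = set(dna_snvs) & set(rna_snvs)
    if not supported:
        return 0.0  # nothing to evaluate; avoid division by zero
    observed = supported & set(peptide_snvs)
    return len(observed) / len(supported)


def dna_only_peptide_snvs(dna_snvs, rna_snvs, peptide_snvs):
    """SNVs with DNA and peptide evidence but no RNA evidence.

    A non-empty result hints at incomplete transcriptome coverage.
    """
    return (set(dna_snvs) & set(peptide_snvs)) - set(rna_snvs)


# Toy, made-up call sets (NOT data from the paper).
dna = {"chr1:100:A>G", "chr2:200:C>T", "chr3:300:G>A", "chr4:400:T>C", "chr5:500:A>T"}
rna = {"chr1:100:A>G", "chr2:200:C>T", "chr3:300:G>A", "chr4:400:T>C"}
peptides = {"chr1:100:A>G", "chr5:500:A>T"}

sensitivity = peptide_level_sensitivity(dna, rna, peptides)
missing_rna = dna_only_peptide_snvs(dna, rna, peptides)
print(f"peptide-level sensitivity: {sensitivity:.2f}")
print(f"DNA+peptide SNVs lacking RNA support: {sorted(missing_rna)}")
```

In this toy example, one of the four SNVs supported by both DNA and RNA is seen as a peptide, and one peptide-supported SNV has no RNA evidence, which mirrors the two kinds of observation the abstract reports.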