Protein Signatures for Classification and Prognosis of Gastric Cancer
Daguang Wang, Fei Ye, Yabin Sun, Wei Li, Hongyi Liu, Jing Jiang, Yang Zhang, Chengkui Liu, Wei Tong, Ling Gao, Yunguang Sun, Weijia Zhang, Terry SeeToe, Peng Lee, Jian Suo, David Y. Zhang
DOI: https://doi.org/10.1016/j.ajpath.2011.06.010
2011-01-01
Abstract: Current methods have limited accuracy in predicting survival and stratifying patients with gastric cancer for appropriate treatment. We sought to identify protein signatures of gastric cancer for classification and prognostication. The Protein Pathway Array (initial study) and Western blot (confirmation) were used to assess the protein expression in a total of 199 fresh frozen gastric samples. There were 56 paired samples divided into a training set (n = 37) and a validation set (n = 19) for the identification of differentially expressed proteins between tumor and normal tissues. There were 56 tumor samples used to identify proteins correlating with tumor and nodal staging. All 93 tumor samples were used to identify candidate proteins for predicting survival. We confirmed the survival prediction of the candidate proteins by using an additional cohort of gastric cancer samples (n = 50). There were 22 proteins differentially expressed between normal and tumor tissues. Nine proteins were selected to build the predictor to classify normal and tumor samples. Ten proteins were differentially expressed among different T stages and four of these were associated with invasive behavior. An additional four proteins were associated with lymph node metastasis. Two proteins were identified as independent risk factors for overall survival. This study indicated that some dysregulated signaling proteins could be selected as useful biomarkers for tumor classification and predicting outcome in gastric cancer patients.

Gastric cancer is the fourth most common malignancy and ranked as the second leading cause of cancer death worldwide [1. Parkin D.M., Bray F., Ferlay J., Pisani P. Global cancer statistics, 2002. CA Cancer J Clin. 2005; 55: 74-108]. The geographic distribution of incidence and mortality of gastric cancer varies remarkably worldwide. Areas with high incidence include Japan, Korea, China, Eastern Europe, and parts of Latin America. The mortality of gastric cancer has declined in past decades, mainly due to early detection by gastric endoscopy [2. Tsugane S., Sasazuki S. Diet and the risk of gastric cancer: review of epidemiological evidence. Gastric Cancer. 2007; 10: 75-83]. However, unlike that of other common cancers, the prognosis for most gastric cancer is poor and has improved little for the past several decades. Despite recent advances in chemotherapy and surgical techniques, the overall 5-year survival rate is lower than 40% [3. Jemal A., Siegel R., Ward E., Hao Y., Xu J., Murray T., Thun M.J. Cancer statistics, 2008. CA Cancer J Clin. 2008; 58: 71-96]. Perplexingly, the prognosis varies widely in patients with stage II or III disease for undetermined biological reasons. Currently, prognosis of gastric cancer is based on pathology (ie, histology type, invasion, and metastasis), radiological imaging (for staging), and other clinical factors (age and comorbidity), which all determine how patients should be managed (surgery and subsequent chemotherapy). However, these traditional clinicopathological factors have significant limitations. Therefore, a large effort has been made to search for molecular markers for diagnosis, classification, and prognosis of gastric cancer [4. Mori M., Mimori K., Shiraishi T., Tanaka S., Ueo H., Sugimachi K., Akiyoshi T. p27 expression and gastric carcinoma. Nat Med. 1997; 3: 593; 5. Akama Y., Yasui W., Yokozaki H., Kuniyasu H., Kitahara K., Ishikawa T., Tahara E. Frequent amplification of the cyclin E gene in human gastric cancer. Jpn J Cancer Res. 1995; 86: 617-621; 6. Graziano F., Mandolesi A., Ruzzo A., Bearzi I., Testa E., Arduini F., Silva R., Muretto P., Mari D., Berardi R., Scartozzi M., Lai V., Cascinu S., Magnani M. Predictive and prognostic role of E-cadherin protein expression in patients with advanced gastric carcinomas treated with palliative chemotherapy. Tumour Biol. 2004; 25: 106-110; 7. Sanz-Ortega J., Steinburg S.M., Moro E., Saez M., Lopez J.A., Sierra E., Sanz-Esponera J., Merino M.J. Comparative study of tumor angiogenesis and immunohistochemistry for p53, c-erbB2, c-myc and EGFr as prognostic factors in gastric cancer. Histol Histopathol. 2000; 15: 455-462]. For example, cell cycle regulation factors (p27 and cyclin E) [4, 5], cell adhesion molecules (E-cadherin) [6], oncogenes (c-erbB2 and c-myc) [7], and tumor suppressor genes (p53) [7] have been reported to correlate with the prognosis of gastric cancer patients.
Despite these reports, inconsistent results exist among the different studies, and the reported parameters provide limited information on the prognosis of individual patients because of the complex biology of the disease [8. Zheng L., Wang L., Ajani J., Xie K. Molecular basis of gastric cancer development and progression. Gastric Cancer. 2004; 7: 61-77]. In this study, we attempted to screen for proteins that can be used for diagnosis and prognosis of gastric cancer using the Protein Pathway Array method, a multiplex immunoblot-based assay combined with computational analysis [9. Zhang D.Y., Ye F., Gao L., Liu X., Zhao X., Che Y., Wang H., Wang L., Wu J., Song D., Liu W., Xu H., Jiang B., Zhang W., Wang J., Lee P. Proteomics, pathway array and signaling network-based medicine in cancer. Cell Div. 2009; 4: 20-36]. The Protein Pathway Array is a novel proteomic method that can characterize hundreds of proteins in clinical samples and identify alterations in protein expression or abundance with biomarker potential. We applied this unique approach to identify differentially expressed signal transduction proteins in gastric cancer tissue. Because the dysregulation of signal transduction proteins is responsible for cancer development, these proteins can be used as a signature for the diagnosis and prognosis of gastric cancer. Using this approach, we successfully identified a panel of nine proteins for distinguishing gastric cancer, four proteins associated with invasion, and two proteins for prognosis of survival.
Fifty-six pairs of gastric cancer and adjacent nontumor mucosa (37 in the training set and 19 in the validation set), and an additional 87 cancer tissues (37 in the additional set and 50 in the second cohort) (Figure 1) were obtained after informed consent from patients who underwent D2 gastrectomy (ie, radical gastrectomy with level 2 extended lymphadenectomy) between February 2008 and June 2009 at The First Hospital of Jilin University, Jilin, China. This study was reviewed and approved by the Institutional Ethical Review Board of The First Hospital of Jilin University. After immediate pathological examination, representative tumor and adjacent normal tissues from these patients were dissected and frozen in a liquid nitrogen tank within 30 minutes of removal. Tumor samples of 3 × 3 × 5 mm³ were taken from areas without gross necrosis. Adjacent nontumor mucosa samples of 3 × 3 × 5 mm³ were taken from the same patient 3 cm away from the tumor margin. The tumor samples did not contain normal mucosal tissue, except for occasional entrapped gastric glands. The mucosa samples contained mucosa and a part of adherent submucosa; neither tumor nor dysplasia was included [9]. The clinicopathological data of the patients are summarized in Table 1. A total of 143 patients (137 advanced and 6 early gastric cancers) were included (93 initial samples and 50 second cohort samples). One hundred and twenty patients had regional lymph node metastasis and one patient had distant metastasis (liver) at the time of surgery. TNM staging of the tumors was performed according to the American Joint Committee on Cancer [10. Mullaney P.J., Wadley M.S., Hyde C., Wyatt J., Lawrence G., Hallissey M.T., Fielding J.W. Appraisal of compliance with the UICC/AJCC staging system in the staging of gastric cancer Union Internacional Contra la Cancrum/American Joint Committee on Cancer. Br J Surg. 2002; 89: 1405-1408].

Table 1. Patient Demographics and Gastric Cancer Characteristics

Clinicopathological characteristic            First cohort (n = 93)   Second cohort (n = 50)
                                              Patient number (%)      Patient number (%)
Age
  ≤60 years                                   38 (41)                 17 (34)
  >60 years                                   55 (59)                 33 (66)
Sex
  Male                                        73 (78)                 24 (48)
  Female                                      20 (22)                 26 (52)
Family history
  Yes                                         12 (13)                 13 (26)
  No                                          81 (87)                 37 (74)
Histology (histological grade)
  Moderately differentiated adenocarcinoma    30 (32)                 15 (30)
  Poorly differentiated adenocarcinoma        63 (68)                 35 (70)
Vascular invasion
  Yes                                         60 (65)                 31 (62)
  No                                          33 (35)                 19 (38)
AJCC TNM stage*
  I                                           15 (16)                 0 (0)
  II                                          16 (17)                 10 (20)
  III                                         39 (42)                 39 (78)
  IV                                          23 (25)                 1 (2)
Primary tumor
  T1                                          6 (6)                   0 (0)
  T2                                          19 (20)                 0 (0)
  T3                                          64 (69)                 50 (100)
  T4                                          4 (4)                   0 (0)
Node status
  N0                                          23 (24)                 0 (0)
  N1                                          26 (28)                 5 (10)
  N2                                          22 (24)                 35 (70)
  N3                                          22 (24)                 10 (20)
Metastasis
  M0                                          92 (99)                 50 (100)
  M1†                                         1 (1)                   0 (0)

* According to the American Joint Committee on Cancer (AJCC) [10].
† Metastasis to liver.

Proteins were extracted from 199 samples (56 paired samples and 87 unpaired tumors). Of these, 149 were used to assess the level of protein expression and phosphorylation using the Protein Pathway Array, and 50 were used to detect the expression levels of two candidate proteins using Western blot (Figure 1). Fifty-six paired tumors and adjacent normal tissues were used to select the protein panel to distinguish between normal and tumor tissues of gastric cancer.
These 56 pairs of samples were divided into a training set (n = 37) and a validation set (n = 19). A total of 56 tumor samples (including 19 tumor samples from the validation set and an additional 37 new tumor samples) were used to identify the protein panel to distinguish different TNM stages. All 93 tumor samples were used to assess the candidate proteins for predicting survival. An additional cohort of gastric cancer samples (n = 50) was used to confirm the ability of candidate proteins to predict survival.

Total proteins were extracted from the 149 fresh frozen gastric samples using 1× sample lysis buffer (Cell Signaling Technology, Danvers, MA) containing 20 mmol/L Tris-HCl (pH 7.5), 150 mmol/L NaCl, 1 mmol/L Na2EDTA, 1 mmol/L EGTA, 1% Triton, 2.5 mmol/L sodium pyrophosphate, 1 mmol/L β-glycerophosphate, 1 mmol/L Na3VO4, and 1 μg/mL leupeptin in the presence of 1× proteinase inhibitor cocktail (Roche Applied Science, Indianapolis, IN) and 1× phosphatase inhibitor cocktail (Roche Applied Science). The lysate was sonicated 3 times for 15 seconds each, and then centrifuged at 14,000 rpm for 30 minutes at 4°C. The protein concentration was determined with the BCA Protein Assay kit (Pierce, Rockford, IL). Approximately 300 μg of protein lysate was loaded in one well across the entire width of a 10% SDS-polyacrylamide gel and separated by electrophoresis, as previously described [11. Ye F., Che Y., McMillen E., Gorski J., Brodman D., Saw D., Jiang B., Zhang D.Y. The effect of Scutellaria Baicalensis on the signaling network in hepatocellular carcinoma cells. Nutr Cancer. 2009; 61: 530-537]. After electrophoresis, the proteins were transferred electrophoretically to a nitrocellulose membrane (Bio-Rad, Hercules, CA), which was then blocked for 1 hour with blocking buffer consisting of either 5% milk or 3% bovine serum albumin in 1× Tris-buffered saline with Tween 20 (TBST) containing 20 mmol/L Tris-HCl (pH 7.5), 100 mmol/L NaCl, and 0.1% Tween-20.
Next, the membrane was clamped on a Western blotting manifold (Mini-PROTEAN II Multiscreen Apparatus, Bio-Rad, Hercules, CA) that isolates 20 channels across the membrane. The multiplex immunoblot was performed using a total of 142 protein-specific or phosphorylation site-specific antibodies (Table 2). Four sets of antibodies (a total of 36 to 38 protein-specific or phosphorylation site-specific antibodies per set) were individually used for each membrane, and all of the antibodies (from various companies) were validated independently before inclusion in the Protein Pathway Array. For the first set of 36 primary antibodies, a mixture of two antibodies in the blocking buffer was added to each channel and then incubated at 4°C overnight. The membrane was then washed with 1× Tris-buffered saline and 1× TBST, and was further incubated with secondary anti-rabbit or anti-mouse antibody conjugated with horseradish peroxidase (Bio-Rad) for 1 hour at room temperature. The membrane was developed with chemiluminescence substrate (Immun-Star HRP Peroxide Buffer/Immun-Star HRP Luminol Enhancer, Bio-Rad), and chemiluminescent signals were captured using the ChemiDoc XRS System (Bio-Rad). The same membrane was then stripped using stripping buffer (Restore Western Blot Stripping Buffer, Thermo Scientific, Rockford, IL) and used to detect a second set of 36 primary antibodies, following the same procedure.
The signal of each protein was determined by densitometric scanning (Quantity One software package, Bio-Rad).

Table 2. List of Antibodies Included in the Protein Pathway Array

Antibodies specific for phosphorylation: p-PKCα (Ser657), p-EGFR (Tyr1068), p-HER2/ERBB2 (Tyr1221/1222), p-PDK1 (Ser241), p-PKCα/βII (Thr638/641), p-p53 (Ser392), p-Akt (Ser473), p-PTEN (Ser380), p-Rb (Ser780), p-survivin (Thr 34), p-beta-catenin (Ser33/37/Thr41), p-STAT5 (Tyr694), p-STAT3 (Ser727), p-ERK (Thr202/Tyr204), p-p70 S6 kinase (Thr389), p-VEGFR-2 (Tyr951), p-FGFR (Tyr653/654), p-EIF4B (Ser422), p-HGFR/C-Met (Y1234/Y1235), p-Smad (Ser463/465), p-ERK5 (Thr218/Tyr220), p-p90RSK (Ser380), p-CREB (Ser133), p-mTOR (Ser2448), p-PKCδ (Thr505), p-CDC2 (Tyr15), p-c-Jun (Ser73), p-SAPK/JNK (Thr183/Tyr185), p-FLT3 (Tyr 591), p-p38 (Thr180/Tyr182), p-GSK-3α/β (Ser21/9), p-FAK (Tyr397), p-RB (Ser807/811), p-HGFR/C-Met (Y1003).

Antibodies for signal transduction proteins: CyclinB1, cyclinD1, CDK6, CDC25B, cyclinE, CDK2, p27, BRCA1, CDK4, neu, 14-3-3 beta, cPKCα, ERK, EGFR, WEE1, CDC25C, HSP90, CHK1, MDM2, CDC2 p34, E2F-1, PCNA, c-myc, Notch1, beta-catenin, Akt, Trap, XIAP, Bcl-2, ETS1, HIF-1α, HIF-2α, TTF-1, p53, Notch4, PTEN, SRC-1, p300, c-Kit, Bax, N-cadherin, Raf-1, CDC42, EIF4B, TNF-α, vimentin, OPN, survivin, E-cadherin, TGF-β, p16, p27, WT1, Mesothelin, Cleaved Caspase-3, COX2, ATF-1, CREB, p21, NF-κB52, NF-κB50, calretinin, H-Ras, Bcl-6, K-Ras, alpha-tubulin, NF-κB p65, Myf-6, p15, ATR, Fas, SUMO-1, MetRS, Ep-CAM, FOXM1, Era, SYK, STAT1, Eg5, HIF-3α, RAD52, ATM, ABCG2, Bad-7, KLF6, CaMKKa, Topo IIa, p38, IL-1β, TERT, Ub, PR, Rap1, HCAM, Lyn, twist, TAP, patched, Erb, VEGF, GLI-3, FGF-7, p63, SK3, rhoB, WNT-1, TDP1, SLUG.

Underlines indicate detectable expression in either tumor or normal tissues. All phosphorylation state-specific antibodies were obtained from Cell Signaling Technology (Danvers, MA), except p-HGFR/C-Met (Y1234/Y1235) and p-HGFR/C-Met (Y1003), which were purchased from R&D Systems (Minneapolis, MN). All non-phospho-antibodies were obtained from Santa Cruz Biotechnology (Santa Cruz, CA), except the following antibodies: 1) ERK, Akt, beta-catenin, Notch4, CREB, Cleaved Caspase-3, EIF4B, NF-κB52, NF-κB50, and STAT1 were obtained from Cell Signaling Technology (Danvers, MA); 2) XIAP was obtained from BD Biosciences (San Jose, CA); 3) TGF-β was obtained from R&D Systems (Minneapolis, MN). TGF, transforming growth factor.

The background was locally subtracted from the raw protein signal, and the background-subtracted intensity was normalized by the "global median subtraction" normalization method to reduce variation between experimental runs (such as differences in transfer and blotting efficiency, total protein loading amount, and exposure density). In detail, the intensity of each protein was divided by the total intensity of all proteins in the same sample, and then multiplied by the average intensity of all proteins in all samples. The normalized data were log2-transformed and used in the subsequent statistical analysis.

Total proteins were extracted from 50 fresh frozen gastric cancer samples, as previously described.
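The normalization described above (divide each protein by its sample total, rescale by an average over all samples, then take log2) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `normalize_intensities` and the nested-dictionary layout are hypothetical, and "average intensities of all proteins in all samples" is interpreted here as the mean of the per-sample totals, which the text does not spell out.

```python
import math


def normalize_intensities(intensities):
    """Normalize background-subtracted signals and return log2 values.

    `intensities` maps sample name -> {protein name -> signal}. This layout
    is a hypothetical choice made for the sketch, not taken from the paper.
    """
    # Total signal of all proteins within each sample.
    totals = {s: sum(proteins.values()) for s, proteins in intensities.items()}
    if any(total <= 0 for total in totals.values()):
        raise ValueError("each sample needs a positive total intensity")

    # ASSUMPTION: the rescaling factor is the mean of the per-sample totals.
    mean_total = sum(totals.values()) / len(totals)

    normalized = {}
    for sample, proteins in intensities.items():
        normalized[sample] = {}
        for protein, signal in proteins.items():
            if signal <= 0:
                raise ValueError("log2 is undefined for non-positive signals")
            scaled = signal / totals[sample] * mean_total
            normalized[sample][protein] = math.log2(scaled)
    return normalized


# Toy data: sample B was loaded at twice the amount of sample A.
raw = {
    "A": {"PCNA": 100.0, "Akt": 300.0},
    "B": {"PCNA": 200.0, "Akt": 600.0},
}
norm = normalize_intensities(raw)
```

In the toy data the two samples differ only by a loading factor, so after normalization they carry identical values, which is the kind of run-to-run variation the method is meant to remove.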
Twenty micrograms of protein were fractionated by electrophoresis through a 10% SDS-polyacrylamide gel and then transferred onto a nitrocellulose membrane. The membrane was incubated with the primary antibodies, including Akt (1:1000 dilution) and cyclin-dependent kinase 2 (1:1000 dilution) (both from Santa Cruz Biotechnology, Santa Cruz, CA) at 4°C overnight. The membrane was then incubated with a secondary anti-rabbit antibody conjugated with horseradish peroxidase (Amersham, Arlington Heights, IL). The protein was detected by chemiluminescence, and chemiluminescent signals were captured using the ChemiDoc XRS System (Bio-Rad), as previously described. The same membrane was then blotted using a monoclonal anti-β-actin antibody (1:10,000 dilution; Sigma, St. Louis, MO). The signal of each protein was determined by densitometric scanning (Quantity One software package, Bio-Rad).

Paired Student's t-test and the Significance Analysis of Microarrays (SAM) tool (http://www-stat.stanford.edu/~tibs/SAM) were used to select the proteins differentially expressed between tumors and normal tissues. K-fold cross validation (K = 10) was used to select those proteins with strong discriminating power to distinguish tumors from normal tissues. K-fold cross validation and unsupervised hierarchical clustering analysis were performed using BRB Array Tools software v.3.3.0 (http://linus.nci.nih.gov/BRB-ArrayTools.html). SPSS v.17.0 software (SPSS Inc., Chicago, IL) was used for Cox proportional hazard regression analysis to correlate the Protein Pathway Array data with the clinical data (TNM and survival), as well as for Kaplan-Meier and log-rank analysis of overall survival.

Twenty-two of the 142 proteins were found to be differentially expressed between tumors and normal tissues in the training set (37 paired samples) using paired t-test and SAM analysis (P < 0.05 or q < 5%) (see Supplemental Table S1 at http://ajp.amjpathol.org) (Figure 2).
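As a rough illustration of the paired comparison used to screen proteins, the sketch below computes a paired t statistic for one protein from matched tumor and normal values. The function name and the toy numbers are hypothetical and are not data from the study; the SAM step and the q-value calculation are not reproduced here.

```python
import math
import statistics


def paired_t_statistic(tumor, normal):
    """Return (t, degrees_of_freedom) for matched tumor/normal measurements."""
    if len(tumor) != len(normal) or len(tumor) < 2:
        raise ValueError("need at least two matched pairs of equal length")
    # Work on within-patient differences, as a paired test requires.
    diffs = [t - n for t, n in zip(tumor, normal)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    n_pairs = len(diffs)
    t_stat = mean_diff / (sd_diff / math.sqrt(n_pairs))
    return t_stat, n_pairs - 1


# Hypothetical log2 intensities for one protein in four matched pairs.
tumor_values = [2.0, 3.0, 5.0, 4.0]
normal_values = [1.0, 1.0, 2.0, 2.0]
t_value, dof = paired_t_statistic(tumor_values, normal_values)
```

A P value would then be read from the t distribution with `dof` degrees of freedom, for example with `scipy.stats.ttest_rel` if SciPy is available.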
Among them, 9 proteins and phosphoproteins were up-regulated in tumors, including proliferating cell nuclear antigen (PCNA), Notch4, CDK4, CDK6, X-linked inhibitor of apoptosis (XIAP), p-protein kinase C (PKC)α/βII, Akt, β-catenin, and p-PKCα, and 13 proteins were down-regulated in tumors, including p-ERK, cyclin B1, cyclin E, p27, E-cadherin, hypoxia-inducible factor (HIF)-3α, Cdc25B, NF-κB52, TDP1, SK3, NF-κB50, SRC-1, and cyclin D1. To identify a robust set of proteins for classification, we carried out supervised K-fold cross validation (K = 10) using two class prediction models, a support vector machine (SVM) and 3-nearest neighbor (3NN). Nine proteins (PCNA, Notch4, p-ERK, CDK6, XIAP, CDK2, Akt, β-catenin, and NF-κB52) with P < 0.01 were selected to build the SVM predictor. Five proteins (PCNA, Notch4, p-ERK, CDK6, and XIAP) were selected to build the 3NN predictor. Ninety-seven percent of the samples in the training set were correctly classified by either SVM or 3NN modeling. Only two samples (1 pair) in the training set were misclassified with this model. To further confirm the ability of these nine proteins to classify gastric cancer, we tested them on a separate validation set of specimens (19 pairs) using the same 3NN and SVM models. All samples in the validation set were correctly classified by 3NN modeling (100% sensitivity and specificity), but 1 pair of samples was misclassified by SVM modeling (95% sensitivity and specificity). A two-way hierarchical clustering analysis was performed for both sets of samples and revealed distinct patterns for both the training set (Figure 3A) and the validation set (Figure 3B), although several samples were misclassified.
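To make the cross-validation idea concrete, here is a minimal sketch of 10-fold cross-validation around a 3-nearest-neighbor classifier, written with the Python standard library only. It is not the BRB Array Tools procedure used in the study: the feature selection inside each fold and the handling of paired samples are omitted, and the data points, function names, and labels are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the majority label among the k nearest training points."""
    ranked = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def kfold_accuracy(xs, ys, n_folds=10, k=3, seed=0):
    """Estimate accuracy by holding out each fold once."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into n_folds roughly equal groups.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in indices if i not in held_out]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        for i in fold:
            if knn_predict(train_x, train_y, xs[i], k) == ys[i]:
                correct += 1
    return correct / len(xs)


# Hypothetical two-protein expression profiles: two well-separated groups.
normal_profiles = [(0.1 * i, 0.05 * i) for i in range(10)]
tumor_profiles = [(5.0 + 0.1 * i, 5.0 - 0.05 * i) for i in range(10)]
samples = normal_profiles + tumor_profiles
labels = ["normal"] * 10 + ["tumor"] * 10
accuracy = kfold_accuracy(samples, labels, n_folds=10, k=3)
```

Because every sample is classified only by a model that never saw it, the resulting accuracy is a less optimistic estimate than the fit on the full training set.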
We also compared the protein expression pattern between two histology grades (moderately and poorly differentiated tumors), and no significant difference was found.

Figure 3. Hierarchical clustering analysis of differentially expressed proteins in paired tumor and normal samples. The expression profile of nine proteins between the paired tumor and normal samples in the training set (n = 37) (A) and validation set (n = 19) (B). The color scale shows the level of expression. Red indicates overexpression, green indicates underexpression, black indicates no change, and gray indicates no expression. The number in each column represents the sample number. Each row represents a protein.

To identify molecular markers to predict gastric cancer behaviors (ie, invasion and lymph node metastasis), we applied the SAM tool to identify proteins differentially expressed among different tumor groups. Based on the pathology report, we classified the level of tumor invasion into four (T stage) groups: 1) T1 stage group (mucosa/submucosa), 2) T2 stage group (muscularis propria/subserosa), 3) T3 stage group (serosa without invasion of adjacent structures), and 4) T4 stage group (adjacent structures). For lymph node metastasis (N stage), we classified the tumors into N0 (no lymph node involvement), N1 (≤6 positive nodes), N2 (7 to 15 positive nodes), and N3 (≥16 positive nodes). Among different levels of invasion, 10 differentially expressed proteins were identified by SAM analysis (P < 0.05). Five proteins (E-cadherin, NF-κB50, HIF-3α, cyclin B1, and cyclin E) were differentially expressed between T1 and T2, and 10 proteins (E-cadherin, β-catenin, NF-κB50, HIF-3α, cyclin B1, cyclin E, XIAP, TDP1, SK3, and CDC25B) were differentially expressed between T1 and T3 and between T1 and T4. No differentially expressed proteins were identified between T2 and T3 or between T2 and T4.
Among these, 10 proteins (E-cadherin, β-catenin, NF-κB50, HIF-3α, cyclin B1, cyclin E, XIAP, TDP1, SK3, and CDC25B) were differentially expressed between T1 and combined T3 and T4 as well as between T1 and combined T2, T3, and T4. No differentially expressed proteins were identified between T2 and combined T3 and T4. Among these 10 proteins, 4 proteins (E-cadherin, CDC25B, HIF-3α, and cyclin B1) were selected as the best predictors by K-fold cross-validation (K = 10) analysis (P < 0.05) to distinguish T1 (early cancer) from combined T2, T3, and T4 (advanced cancer). Two-way hierarchical clustering analysis by BRB Array Tools software using these four proteins separated the 56 tumors into two main groups: 23 tumor samples in group A and the remaining 33 samples in group B (Figure 4). It is worth noting that all six T1 tumors and six T2 tumors (of 12) were classified into group A, and five of the six T1 tumors were clustered into one subgroup. T3 and T4 tumors accounted for 26 of the 33 tumors in group B (78.8%) but only 11 of the 23 tumors in group A (47.8%) (χ2 = 5.796; P = 0.016). The results suggest that group A tumors represent a biologically less invasive cancer. Of these four proteins, two (cyclin B1 and CDC25B) were up-regulated and two (HIF-3α and E-cadherin) were down-regulated in group A tumors, suggesting these proteins are associated with invasive behavior of gastric cancer.

Among different N stages, four differentially expressed proteins were identified by SAM analysis (P < 0.05): PCNA, NF-κB50, Notch4, and CDK6. PCNA was down-regulated in N1 tumors when compared with N0 tumors. NF-κB50 was down-regulated in N2 tumors when compared with N1 tumors. Notch4 and CDK6 were down-regulated in N3 tumors and NF-κB50 was up-regulated in N3 tumors when compared with N2 tumors. These data suggest that these four proteins may be associated with lymph node metastasis.
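The χ2 value reported above for the distribution of T3/T4 tumors between groups A and B can be checked with a short calculation. The sketch below assumes a 2 × 2 table built from the counts in the text (group A: 11 T3/T4 of 23; group B: 26 T3/T4 of 33), with the remaining cells obtained by subtraction, and uses the Pearson statistic without continuity correction. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: group A, group B. Columns: T3/T4 tumors, other tumors.
# ASSUMPTION: the "other" counts (12 and 7) are derived as 23 - 11 and 33 - 26.
chi2, p_value = chi_square_2x2(11, 12, 26, 7)
print(round(chi2, 3), round(p_value, 3))  # prints 5.796 0.016
```

Under these assumptions the uncorrected Pearson statistic reproduces the reported χ2 = 5.796 and P = 0.016.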
In addition, two proteins (HIF-3α and p-PKCα/βII) were found to be associated with vascular invasion of gastric cancer. Of these, HIF-3α was up-regulated in the tumors with vascular invasion (P = 0.042), whereas p-PKCα/βII was down-regulated in the tumors with vascular invasion (P = 0.042).

To identify proteins that may predict overall survival, a univariate Cox proportional hazards regression analysis was performed on the 22 differentially expressed proteins in gastric cancer in a cohort of 93 patients (Table 1 and Figure 1). Two proteins (CDK2 and Akt) were found to correlate with overall survival, with hazard ratios of 1.293 [P = 0.036; 95% confidence interval (CI): 1.017 to 1.644] and 1.431 (P = 0.028; 95% CI: 1.039 to 1.971), respectively. To determine whether these proteins can be independent prognostic markers, a multivariate analysis was performed taking into consideration other clinical parameters, such as age, sex, family history, histology grade, vascular invasion, and TNM stage (Table 3). The data showed that CDK2 and Akt remained independent predictors, with hazard ratios of 1.289 (P = 0.044, 95% CI: 1.011 to 1.644) and 1.572 (P = 0.011, 95% CI: 1.111 to 2.224), respectively. In addition, age at surgery (P = 0.008) and TNM staging (P = 0.011) were also independent predictors of survival (Table 3). Based on the hierarchical clustering analysis of CDK2 and Akt expression, the tumor samples were separated into either high or low expression groups. The group with high-level expression of CDK2 or Akt was associated with a poorer prognosis according to Kaplan-Meier and log-rank survival analysis (P = 0.01 and P = 0.03, respectively