Building Prognostic and Predictive Models for Cancer Patients Using Expression Modules and Clinical Variables.
C. Fan, J. Parker, C. M. Perou
DOI: https://doi.org/10.1158/0008-5472.sabcs-5063
IF: 11.2
2009-01-01
Cancer Research
Abstract: Abstract #5063. Background: Prognostication for breast cancer patients is improving, in part through the use of genomic predictors. However, most predictors use only one data type, gene expression data; prognostication that simultaneously uses multiple data types may therefore yield even more accurate predictors. Methods: We recently developed a unique risk of relapse predictor based upon a Cox proportional hazards model that contained gene expression (5 intrinsic subtypes) and clinical variables (tumor size and node status) (Parker ASCO 2008 abstract #11008). This model was accurate in predicting 7-year relapse probabilities and outperformed the genomic-only and clinical-only models. Building upon this combined genomic plus clinical parameters approach, we reasoned that including other expression modules indicative of pathway activation and cell type signatures might further improve our models. Therefore, we built a collection of >200 expression modules using 1) unsupervised hierarchical and bi-clustering analysis of 600 human tumors, 2) similar cluster analyses of 250 mouse mammary tumors, 3) >50 published breast tumor genomic profiles, and 4) signaling pathway activation modules derived from our own work. Each signature, or module, was then used with LASSO variable selection to determine whether prognostic Cox proportional hazards models could be built using a local therapy and tamoxifen-only treated patient set. Results: These analyses show that highly significant prognostic models could be built for 1) all patients, 2) ER-positive patients, and 3) Luminal patients. Less predictive, yet significant, models were created for Basal-like patients, and no successful models could be developed for ER-negative patients or for the expression-defined HER2-enriched subtype.
For most significant predictors, the Cox model contained human-derived modules, mouse-derived modules, and pathology variables; for example, the predictor for ER-positive patients contained human modules for proliferation, hypoxia, and an embryonic stem cell derived signature; three mouse modules, one of which contained multiple metalloproteinases; and tumor size and node status. In addition, the modules selected for the different patient subsets varied, such that the modules selected for the Basal-like patients did not overlap with any module selected for all patients or for the ER-positive patient subset. Conclusions: These analyses further reinforce the idea that breast cancer is a heterogeneous disease and suggest that prognostic models for each important patient subset should be developed and used. Models that contain treatment variables, in addition to prognostic terms, are also being developed and will be presented. In summary, these analyses argue that the best prognostic algorithms for breast cancer patients contain biological variables coming from humans, from model systems, and from classic pathology. Citation Information: Cancer Res 2009;69(2 Suppl):Abstract nr 5063.
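Illustrative sketch (not part of the original abstract): the Methods describe LASSO variable selection over expression modules within a Cox proportional hazards model. The NumPy-only example below shows one way such a selection step could look, using proximal gradient descent on the L1-penalized partial likelihood. It is not the authors' implementation. The simulated module scores, the penalty value `lam`, and the function names `simulate_modules`, `cox_loss_grad`, and `fit_lasso_cox` are all assumptions made for illustration, and the code assumes no tied event times.

```python
import numpy as np

def simulate_modules(n=300, p=20, seed=0):
    """Hypothetical data: n patients, p module scores, 3 truly prognostic modules."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    true_beta = np.zeros(p)
    true_beta[:3] = [1.0, -0.8, 0.6]                  # assumed log hazard ratios
    t_event = rng.exponential(1.0 / np.exp(X @ true_beta))
    t_cens = rng.exponential(2.0, size=n)             # independent censoring
    time = np.minimum(t_event, t_cens)
    event = (t_event <= t_cens).astype(float)         # 1 = relapse observed
    return X, time, event

def cox_loss_grad(beta, X, time, event):
    """Mean negative log partial likelihood and gradient (assumes no tied times)."""
    order = np.argsort(-time)                         # latest first: risk sets are prefixes
    Xs, ev = X[order], event[order]
    eta = Xs @ beta
    eta = eta - eta.max()                             # constant shift cancels in the likelihood
    w = np.exp(eta)
    cum_w = np.cumsum(w)                              # sum of exp(eta) over each risk set
    cum_wx = np.cumsum(w[:, None] * Xs, axis=0)
    n = len(time)
    loss = -np.sum(ev * (eta - np.log(cum_w))) / n
    grad = -np.sum(ev[:, None] * (Xs - cum_wx / cum_w[:, None]), axis=0) / n
    return loss, grad

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty; produces exact zeros."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def fit_lasso_cox(X, time, event, lam, n_iter=500, step=1.0, tol=1e-8):
    """Proximal gradient descent with backtracking on the step size."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        loss, grad = cox_loss_grad(beta, X, time, event)
        while True:
            cand = soft_threshold(beta - step * grad, step * lam)
            diff = cand - beta
            new_loss, _ = cox_loss_grad(cand, X, time, event)
            # Accept the step once the quadratic upper bound holds.
            if new_loss <= loss + grad @ diff + (diff @ diff) / (2 * step) + 1e-12:
                break
            step *= 0.5
        beta = cand
        if np.max(np.abs(diff)) < tol:
            break
    return beta

X, time, event = simulate_modules()
beta_hat = fit_lasso_cox(X, time, event, lam=0.1)
print("selected module indices:", np.flatnonzero(beta_hat))
```

In practice the penalty would normally be chosen by cross-validation, and clinical variables such as tumor size and node status could be appended as extra columns of `X`; both steps are omitted here to keep the sketch short.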