Abstract 4729: Pathway and Gene Set Analyses for Epithelial Ovarian Cancer (EOC) Genome-Wide Association Study (GWAS)
Yian A. Chen, Zhihua Chen, Ya-Yu Tsai, Xiaotao Qu, Edwin Iversen, Jill Barnholtz-Sloan, Brooke L. Fridley, Jenny Permuth-Wey, Harvey Risch, Julie M. Cunningham, Robert A. Vierkant, David Fenstermacher, Rebecca Sutphen, Catherine M. Phelan, Alvaro N. Monteiro, Michael J. Birrer, Daniel W. Cramer, Steven A. Narod, John McLaughlin, Joellen M. Schildkraut, Ellen L. Goode, Thomas A. Sellers
DOI: https://doi.org/10.1158/1538-7445.am2011-4729
IF: 11.2
2011-01-01
Cancer Research
Abstract: The etiology of ovarian cancer is poorly understood, but there is clearly a heritable component. Efforts to identify susceptibility alleles rarely consider interactions among alleles or their joint effects because such analyses quickly become computationally intractable. In this study, we sought to identify joint multi-SNP effects using a pathway-based analysis (GSEA-SNP) of 1,952 EOC cases and 2,042 frequency-matched controls genotyped with the Illumina 610K array. All subjects were self-reported non-Hispanic, non-Jewish Caucasians, with SNP and sample call rates > 95%. Subjects with ambiguous gender, unresolved identical genotypes, or < 80% European ancestry were excluded, as were SNPs with MAF < 1%. Missing genotypes were inferred using MaCH based on the HapMap CEU population. SNPs not within introns were annotated to genes within 100 bp. We retrieved the following collections of gene sets (GSs) from the compiled database MSigDB: human chromosome and cytogenetic band, chemical and genetic perturbations, canonical pathways, microRNA binding targets, transcription factor targets (TFT), cancer gene neighborhoods (CGN), cancer modules, and Gene Ontology. The GSEA-SNP approach calculates the trend statistic for association between each SNP and EOC risk and rank-orders the SNPs by that statistic. The Enrichment Score (ES) estimates the overrepresentation of top-ranked SNPs in each GS; its statistical significance was estimated using 10,000 permutations. The ES was normalized according to the size of the GS to yield the Normalized Enrichment Score (NES). The false discovery rate (FDR) was estimated using the NES to adjust for multiple hypothesis testing. A total of 5,181 gene sets were included in the analysis. When controlling the FDR at 15%, 14 GSs were highly enriched with association signals, including two chromosomal regions, 8q13 (p = 0.0055, FDR = 0.15) and 6p24 (p = 0.0012, FDR = 0.08).
BIOCARTA_LYM_PATHWAY, a pathway related to cell adhesion and diapedesis of lymphocytes, was also significant (p = 0.0002, FDR = 0.12). A TFT GS, composed of genes whose promoter regions around the transcription start site contain the motif GGCNRNWCTTYS, was associated with risk (p = 0.0004, FDR = 0.08). No transcription factor is currently known to bind this computationally predicted motif. The Gene Ontology molecular function GS for double-stranded RNA binding was also enriched (p = 0.005, FDR = 0.10). The remaining enriched sets were CGN sets, for which the neighborhoods around the cancer genes were originally defined using correlation of gene expression in 4 large data sets of various cancer types. The enriched CGN sets included: CD48 (p = 0.011), CD53 (p = 0.009), CD97 (p = 0.011), INPP5D (p = 0.010), PTPN6 (p = 0.008), VAV1 (p = 0.011), ITGAL (p = 0.012), PTPRC (p = 0.012), and STAT6 (p = 0.010). In summary, these analyses detected biologically plausible GSs related to the etiology of EOC, highlighting SNPs in core enrichment groups that were not identified using individual SNP tests.
Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 102nd Annual Meeting of the American Association for Cancer Research; 2011 Apr 2-6; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2011;71(8 Suppl):Abstract nr 4729. doi:10.1158/1538-7445.AM2011-4729
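The ranking, enrichment, and permutation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions not stated in the abstract: the trend statistic is computed as the Cochran-Armitage chi-square via N * r^2 with genotypes coded 0/1/2, the ES is the weighted Kolmogorov-Smirnov running sum used in GSEA, the permutations shuffle case/control labels, and the NES divides the observed ES by the mean positive null ES. The function names `trend_statistic`, `enrichment_score`, and `permutation_test` are hypothetical.

```python
import numpy as np


def trend_statistic(genotypes, is_case):
    """Per-SNP trend chi-square, computed as N * r^2 (assumed 0/1/2 coding).

    genotypes: (n_subjects, n_snps) array; every SNP must be polymorphic.
    is_case:   (n_subjects,) array of 0 (control) / 1 (case).
    """
    g = genotypes - genotypes.mean(axis=0)
    y = is_case - is_case.mean()
    cov = g.T @ y
    denom = (g ** 2).sum(axis=0) * (y ** 2).sum()
    return len(y) * cov ** 2 / denom


def enrichment_score(stats, in_set, weight=1.0):
    """GSEA-style running-sum ES for one gene set.

    stats:  (n_snps,) association statistics.
    in_set: (n_snps,) boolean mask of SNPs annotated to the gene set.
    """
    order = np.argsort(-stats)            # strongest association first
    hits = in_set[order]
    w = np.abs(stats[order]) ** weight
    step_hit = np.where(hits, w, 0.0) / w[hits].sum()
    step_miss = np.where(hits, 0.0, 1.0) / (~hits).sum()
    running = np.cumsum(step_hit - step_miss)
    return running[np.argmax(np.abs(running))]  # maximum deviation from zero


def permutation_test(genotypes, is_case, in_set, n_perm=1000, seed=0):
    """Observed ES, one-sided permutation p-value, and NES for one gene set."""
    rng = np.random.default_rng(seed)
    observed = enrichment_score(trend_statistic(genotypes, is_case), in_set)
    # Assumption: the null is built by shuffling case/control labels.
    null = np.array([
        enrichment_score(
            trend_statistic(genotypes, rng.permutation(is_case)), in_set
        )
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    positive = null[null > 0]
    nes = observed / positive.mean() if positive.size else float("nan")
    return observed, p_value, nes
```

Calling `permutation_test(genotypes, is_case, in_set, n_perm=10000)` would mirror the 10,000 permutations reported above. An FDR across all gene sets would then be estimated from the resulting NES values, a step this sketch omits.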