Genetic variation at CYP3A is associated with age at menarche and breast cancer risk: a case-control study
N. Johnson, F. Dudbridge, N. Orr, L. Gibson, Michael E. Jones, M. Schoemaker, E. Folkerd, B. Haynes, J. Hopper, M. Southey, G. Dite, C. Apicella, M. Schmidt, A. Broeks, Laura J. Van
Abstract
Introduction: We have previously shown that a tag single nucleotide polymorphism (rs10235235), which maps to the CYP3A locus (7q22.1), was associated with a reduction in premenopausal urinary estrone glucuronide levels and a modest reduction in risk of breast cancer in women aged ≤50 years.
Methods: We further investigated the association of rs10235235 with breast cancer risk in a large case–control study of 47,346 cases and 47,570 controls from 52 studies participating in the Breast Cancer Association Consortium. Genotyping of rs10235235 was conducted using a custom Illumina Infinium array. Stratified analyses were conducted to determine whether this association was modified by age at diagnosis, ethnicity, age at menarche or tumor characteristics.
Results: We confirmed the association of rs10235235 with breast cancer risk for women of European ancestry but found no evidence that this association differed with age at diagnosis. Heterozygote and homozygote odds ratios (ORs) were OR = 0.98 (95% CI 0.94, 1.01; P = 0.2) and OR = 0.80 (95% CI 0.69, 0.93; P = 0.004), respectively (Ptrend = 0.02). There was no evidence of effect modification by tumor characteristics. rs10235235 was, however, associated with age at menarche in controls (Ptrend = 0.005) but not cases (Ptrend = 0.97). Consequently, the association between rs10235235 and breast cancer risk differed according to age at menarche (Phet = 0.02); the rare allele of rs10235235 was associated with a reduction in breast cancer risk for women who had their menarche at age ≥15 years (ORhet = 0.84, 95% CI 0.75, 0.94; ORhom = 0.81, 95% CI 0.51, 1.30; Ptrend = 0.002) but not for those who had their menarche at age ≤11 years (ORhet = 1.06, 95% CI 0.95, 1.19; ORhom = 1.07, 95% CI 0.67, 1.72; Ptrend = 0.29).
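The reported homozygote estimate can be checked for internal consistency. The sketch below is not part of the study's analysis. It assumes a Wald-type interval that is symmetric on the log-odds scale (an assumption, since the abstract does not state how the interval was derived), recovers the standard error from the published 95% CI, and converts the resulting z statistic to a two-sided P value. The function name `p_from_or_ci` is illustrative.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z statistic and two-sided P value from an OR and its 95% CI.

    Assumes a Wald interval on the log scale: log(OR) +/- z_crit * SE.
    """
    # Width of the interval on the log scale is 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: P = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Homozygote estimate quoted in the Results: OR = 0.80 (95% CI 0.69, 0.93).
z, p = p_from_or_ci(0.80, 0.69, 0.93)
print(f"z = {z:.2f}, two-sided P = {p:.4f}")
```

Because the published OR and confidence limits are rounded to two decimals, the recovered value can only be expected to approximate the reported P = 0.004.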
Conclusions: To our knowledge rs10235235 is the first single nucleotide polymorphism to be associated with both breast cancer risk and age at menarche consistent with the well-documented association between later age at menarche and a reduction in breast cancer risk. These associations are likely mediated via an effect on circulating hormone levels.

Introduction
Family history is a well-established risk factor for breast cancer. First-degree relatives of women with breast cancer have an approximately twofold increased risk of developing the disease relative to the general population [1]. Twin studies are consistent with this familial clustering having, at least in part, a genetic origin [2,3]. Mutations in high-risk susceptibility genes (mainly BRCA1 and BRCA2) explain most large multiple-case families, but account for only 15 to 20% of the excess familial risk [4]. Genome-wide association studies [5,6] have identified more than 70 common variants that are associated with breast cancer susceptibility but they account for only another approximately 15% of the excess familial risk. The so-called ‘missing heritability’ may be explained by common variants with very small effects and/or by rarer variants with larger effects, neither of which can be identified by current genome-wide association studies. A statistically efficient alternative is to increase power by trying to identify variants associated with known quantitative phenotypic markers of susceptibility to breast cancer [7], and then to test them for association with breast cancer risk. This approach might also improve our understanding of the biological mechanisms involved in breast cancer pathogenesis.
Endogenous sex hormones are well-established risk factors for breast cancer in postmenopausal women [8]; the evidence in premenopausal women is less consistent, with some, but not all, studies suggesting an association between higher circulating levels of estrogens and increased breast cancer risk [9-17].
Genetic factors influence the levels of endogenous sex hormones [18] and therefore single nucleotide polymorphisms (SNPs) in genes regulating these hormonal pathways are good candidates for being breast cancer predisposition variants. We have previously studied 642 SNPs tagging 42 genes that might influence sex hormone levels in 729 healthy premenopausal women of European ancestry in relation to cyclic variations in oestrogen levels during the menstrual cycle. We found that the minor allele of rs10273424, which maps 50 kb 3′ to CYP3A5, was associated with a reduction of 22% (95% confidence interval (CI) = –28%, –15%; P = 10) in levels of urinary oestrone glucuronide, a metabolite that is highly correlated with serum oestradiol levels [19]. Analysis of 10,551 breast cancer cases and 17,535 controls of European ancestry demonstrated that the minor allele of rs10235235, a proxy for rs10273424 (r = 1.0), was also associated with a weak reduction in breast cancer risk but only in women aged 50 years or younger at diagnosis (odds ratio (OR) = 0.91, 95% CI = 0.83, 0.99; P = 0.03) [19]. The aim of the present study was to further investigate an association between rs10235235 and breast cancer risk using a much larger set of subjects – the Breast Cancer Association Consortium (BCAC) – comprising data from 49 additional studies, and to assess whether there was evidence of effect modification by age at diagnosis, ethnicity, age at menarche or tumour characteristics.

Johnson et al. Breast Cancer Research 2014, 16:R51 http://breast-cancer-research.com/content/16/3/R51

Materials and methods
Sample selection
Samples for the case–control analyses were drawn from 52 studies participating in the BCAC: 41 studies from populations of predominantly European ancestry, nine studies of Asian ancestry and two studies of African-American ancestry.
The majority were population-based or hospital-based case–control studies, but some studies were nested in cohorts, selected samples by age, oversampled for cases with a family history or selected samples on the basis of tumour characteristics (Table S1 in Additional file 1). Studies provided ~2% of samples in duplicate for quality control purposes (see below). Study subjects were recruited on protocols approved by the Institutional Review Boards at each participating institution, and all subjects provided written informed consent (Additional file 2).

Genotyping and post-genotyping quality control
Genotyping for rs10235235 was carried out as part of a collaboration between the BCAC and three other consortia (the Collaborative Oncological Gene-environment Study (COGS)). Full details of SNP selection, array design, genotyping and post-genotyping quality control have been published [5]. Briefly, three categories of SNPs were chosen for inclusion in the array: SNPs selected on the basis of pooled genome-wide association study data; SNPs selected for the fine-mapping of published risk loci; and candidate SNPs selected on the basis of previous analyses or specific hypotheses. rs10235235 was a candidate SNP selected on the basis of our previous analyses [19]. For the COGS project overall, genotyping of 211,155 SNPs in 114,225 samples was conducted using a custom Illumina Infinium array (iCOGS; Illumina, San Diego, CA, USA) in four centres. Genotypes were called using Illumina’s proprietary GenCall algorithm.
Standard quality control measures were applied across all SNPs and all samples genotyped as part of the COGS project.
Samples were excluded for any of the following reasons: genotypically not female XX (XY, XXY or XO, n = 298); overall call rate <95% (n = 1,656); low or high heterozygosity (P < 10, separately for individuals of European, Asian and African-American ancestry, n = 670); individuals not concordant with previous genotyping within the BCAC (n = 702); individuals where genotypes for the duplicate sample appeared to be from a different individual (n = 42); cryptic duplicates within studies where the phenotypic data indicated that the individuals were different, or between studies where genotype data indicated samples were duplicates (n = 485); first-degree relatives (n = 1,981); phenotypic exclusions (n = 527); or concordant replicates (n = 2,629).
Ethnic outliers were identified by multidimensional scaling, combining the iCOGS array data with the three HapMap2 populations, based on a subset of 37,000 uncorrelated markers that passed quality control (including ~1,000 selected as ancestry informative markers). Most studies were predominantly of a single ancestry (European or Asian), and women with >15% minority ancestry, based on the first two components, were excluded (n = 1,244). Two studies from Singapore (SGBCC) and Malaysia (MYBRCA; see Table S1 in Additional file 1 for all full study names) contained a substantial fraction of women of mixed European/Asian ancestry (probably of South Asian ancestry). For these studies, no exclusions for ethnic outliers were made, but principal components analysis (see below) was used to adjust for inflation in these studies. Similarly, for the two African-American studies (NBHS and SCCS), no exclusions for ethnic outliers were made.
Principal component analyses were carried out separately for the European, Asian and African-American subgroups, based on a subset of 37,000 uncorrelated SNPs.
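Two of the sample-level rules described above (the 95% call-rate threshold and the 15% minority-ancestry threshold, with the four named studies exempt from the ancestry rule) can be expressed as a simple filter. The following is only an illustrative sketch and not the COGS pipeline: the `Sample` record, its field names, the function `exclusion_reason`, the study label `STUDY_A` and the toy values are all hypothetical, and the remaining exclusion criteria (sex check, heterozygosity, duplicates, relatives) are omitted.

```python
from dataclasses import dataclass
from typing import Optional

CALL_RATE_MIN = 0.95          # text: overall call rate < 95% is excluded
MINORITY_ANCESTRY_MAX = 0.15  # text: > 15% minority ancestry is excluded

# Studies for which the text states no ethnic-outlier exclusions were made.
NO_ANCESTRY_EXCLUSION = {"SGBCC", "MYBRCA", "NBHS", "SCCS"}

@dataclass
class Sample:
    # Hypothetical record layout, for illustration only.
    sample_id: str
    study: str
    call_rate: float          # fraction of SNPs with a genotype call
    minority_ancestry: float  # estimated minority-ancestry fraction

def exclusion_reason(s: Sample) -> Optional[str]:
    """Return the first rule that excludes the sample, or None if it is kept."""
    if s.call_rate < CALL_RATE_MIN:
        return "call rate < 95%"
    if (s.study not in NO_ANCESTRY_EXCLUSION
            and s.minority_ancestry > MINORITY_ANCESTRY_MAX):
        return "> 15% minority ancestry"
    return None

# Toy data (invented values, not study data).
samples = [
    Sample("s1", "STUDY_A", 0.99, 0.02),
    Sample("s2", "STUDY_A", 0.93, 0.02),
    Sample("s3", "STUDY_A", 0.99, 0.30),
    Sample("s4", "SGBCC", 0.99, 0.30),
]
for s in samples:
    print(s.sample_id, exclusion_reason(s) or "kept")
```

In this sketch the thresholds are applied strictly, so a sample with a call rate of exactly 95% or exactly 15% minority ancestry is kept; how the original analysis treated such boundary cases is not stated in the text.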
For the analyses of European subjects, we included the first six principal components as covariates, together with a seventh component derived specific to one study (LMBC) for which there was substantial inflation not accounted for by the components derived from the analysis of all studies. Addition of further principal components did not reduce inflation further. Two principal components were included for the studies conducted in Asian populations and two principal components were included for the African-American studies.
For the main analyses of rs10235235 and breast cancer risk, we excluded women from three studies (BBCS, BIGGS and UKBGS) that were genotyped in the hypothesis-generating study (n = 5,452) [19] and women with non-invasive cancers (ductal carcinoma in situ/lobular carcinoma in situ, n = 2,663) or cancers of uncertain status (n = 960). After