

In the United States, breast cancer is the most frequently diagnosed non-skin cancer in women, and one in five women who are diagnosed develop the disease before age 50. Germline genetic variation is a known risk factor for developing breast cancer and a suspected risk factor for breast cancer mortality, but previous investigations have not identified all of the genetic variation that is expected to be associated with breast cancer. One possible explanation for this gap is that rare germline variation has only recently become feasible to study: until recently it was too expensive and technically challenging to measure in the large number of participants that genetic epidemiologic studies require, and identifying rare variants poses methodological challenges of its own. This thesis uses three complementary methods (single marker regression analysis, SKAT-O gene-based tests, and candidate gene analysis) to identify individual risk loci, and three additional complementary methods (Kriging whole genome prediction, polygenic risk scores, and whole genome heritability estimates) to predict breast cancer risk and breast cancer mortality in a population of women who were diagnosed with breast cancer before the age of 50. Suggestively associated risk loci were examined for evidence of replication in an independent sample. For breast cancer risk, the identification analyses find three genes in which variation is associated with risk of breast cancer: FGFR2 (discovery p=2.18e-5; replication p<1e-30), NEK10 (discovery p=1.20e-3; replication p<1e-30), and MKL1 (discovery p=2.62e-4; replication p<1e-30).
Previous studies had identified loci near each of these genes as being associated with breast cancer risk, but conditional analyses indicate that the associations in the MKL1 and NEK10 genes are driven by risk loci that are distinct from those previously reported and that would not have been identified using single variant regression. The genetic data alone predict breast cancer risk with an AUC of 0.618 (95% CI: 0.610-0.629). When a limited set of non-genetic predictors is also incorporated, the combined model predicts breast cancer risk with an AUC of 0.655 (95% CI: 0.649-0.660). This combined model is a significant improvement over models that include only the genetic information or only the non-genetic risk factors. In contrast to the analyses of the genetic determinants of breast cancer development, this thesis does not find compelling evidence that breast cancer mortality is strongly driven by the germline genetic variation that our study could measure. The identified genes all represent possible pharmacological targets for cancer chemoprevention. The prediction model for breast cancer risk improves upon existing methods of prediction and is strong enough to be useful at the population level. From a clinical perspective, the model still has low discrimination, but it may be strong enough to be used in very specific scenarios, such as interpreting ambiguous screening results or helping individuals understand their personal risk when considering other medical treatments that may increase the risk of breast cancer, such as hormone replacement therapy or hormone-assisted reproductive therapy. In the context of breast cancer prognosis, these investigations support other lines of evidence suggesting that, for many women who are diagnosed with breast cancer, germline genetic variation does not strongly influence the risk of mortality.

